[57] ABSTRACT 

A method and apparatus for excerpting and summarizing an
undecoded document image, without first converting the
document image to optical character codes such as ASCII
text, identifies significant words, phrases and graphics in the
document image using automatic or interactive morphological
image recognition techniques. Document summaries or
indices are produced based on the identified significant
portions of the document image. The disclosed method is
particularly suited to improving reading machines for
the blind.

METHOD AND APPARATUS FOR
SUMMARIZING A DOCUMENT WITHOUT 
DOCUMENT IMAGE DECODING 

This is a continuation of application Ser. No. 07/794,543
filed Nov. 19, 1991, now abandoned. 

BACKGROUND OF THE INVENTION 

A portion of the disclosure of this patent document
contains material that is subject to copyright protection. The
copyright owner has no objection to the facsimile reproduction
by anyone of the patent document or the patent disclosure,
as it appears in the U.S. Patent and Trademark Office
records, but otherwise reserves all copyright rights whatsoever.

1. Cross-References to Related Applications 

The following concurrently filed and related U.S. applications
are hereby cross referenced and incorporated by
reference in their entirety.

"Method for Determining Boundaries of Words in Text"
to Huttenlocher et al., U.S. patent application Ser. No. 
07/794,392. 

"Detecting Function Words Without Converting a Document
to Character Codes" to Bloomberg et al., U.S. patent
application Ser. No. 07/794,190. 

"A Method of Deriving Wordshapes for Subsequent Com- 
parison" to Huttenlocher et al., U.S. patent application Ser. 
No. 07/794,391. 

"Method and Apparatus for Determining the Frequency of
Words in a Document Without Document Image Decoding"
to Cass et al., U.S. patent application Ser. No. 07/795,173.

"Optical Word Recognition by Examination of Word
Shape" to Huttenlocher et al., U.S. patent application Ser. 
No. 07/796,119, Published European Application No. 
0543592, published May 26, 1993. 

"A Method and Apparatus for Automatic Modification of 
Selected Semantically Significant Image Segments Within a
Document Without Document Image Decoding" to Huttenlocher
et al., U.S. patent application Ser. No. 07/795,174.

"Method for Comparing Word Shapes" to Huttenlocher et
al., U.S. patent application Ser. No. 07/795,169.

"Method and Apparatus for Determining the Frequency of
Phrases in a Document Without Document Image Decoding"
to Withgott et al., U.S. patent application Ser. No. 07/794,555,
now U.S. Pat. No. 5,369,714.

2. Field of the Invention 

This invention relates to improvements in methods and 
apparatuses for automatic document processing, and more 
particularly to improvements in methods and apparatuses for 
recognizing semantically significant words, characters, 
images, or image segments in a document image without
first decoding the document image and automatically creating
a summary version of the document contents.

3. Background 

It has long been the goal in computer based electronic 
document processing to be able, easily and reliably, to
identify, access and extract information contained in elec- 
tronically encoded data representing documents; and to 
summarize and characterize the information contained in a 
document or corpus of documents which has been electronically
stored. For example, to facilitate review and evaluation
of the information content of a document or corpus of
documents to determine the relevance of same for a particular
user's needs, it is desirable to be able to identify the
semantically most significant portions of a document, in 
terms of the information they contain; and to be able to 
present those portions in a manner which facilitates the 
user's recognition and appreciation of the document con- 
tents. However, the problem of identifying the significant 
portions within a document is particularly difficult when 
dealing with images of the documents (bitmap image data), 
rather than with code representations thereof (e.g., coded 
representations of text such as ASCII). As opposed to ASCII 
text files, which permit users to perform operations such as 
Boolean algebraic key word searches in order to locate text 
of interest, electronic documents which have been produced 
by scanning an original without decoding to produce docu- 
ment images are difficult to evaluate without exhaustive 
viewing of each document image, or without hand-crafting 
a summary of the document for search purposes. Of course, 
document viewing or creation of a document summary 
requires extensive human effort.

On the other hand, current image recognition methods, 
particularly involving textual material, generally involve 
dividing an image segment to be analyzed into individual 
characters which are then deciphered or decoded and 
matched to characters in a character library. One general 
class of such methods includes optical character recognition 
(OCR) techniques. Typically, OCR techniques enable a 
word to be recognized only after each of the individual
characters of the word has been decoded, and a corresponding
word image retrieved from a library.

Moreover, optical character recognition decoding opera- 
tions generally require extensive computational effort, gen- 
erally have a non-trivial degree of recognition error, and 
often require significant amounts of time for image process- 
ing, especially with regard to word recognition. Each bitmap 
of a character must be distinguished from its neighbors, its 
appearance analyzed, and identified in a decision making 
process as a distinct character in a predetermined set of 
characters. Further, the image quality of the original docu- 
ment and noise inherent in the generation of a scanned image 
contribute to uncertainty regarding the actual appearance of 
the bitmap for a character. Most character identifying pro- 
cesses assume that a character is an independent set of 
connected pixels. When this assumption fails due to the 
quality of the image, identification also fails.

4. References 

European patent application number 0-361-464 by Doi 
describes a method and apparatus for producing an abstract 
of a document with correct meaning precisely indicative of 
the content of the document. The method includes listing 
hint words which are preselected words indicative of the 
presence of significant phrases that can reflect content of the 
document, searching all the hint words in the document, 
extracting sentences of the document in which any one of the 
listed hint words is found by the search, and producing an 
abstract of the document by juxtaposing the extracted sen- 
tences. Where the number of hint words produces a lengthy 
excerpt, a morphological language analysis of the abstracted 
sentences is performed to delete unnecessary phrases and 
focus on the phrases using the hint words as the right part of 
speech according to a dictionary containing the hint words. 

"A Business Intelligence System" by Luhn, IBM Journal, 
October 1958, describes a system which, in part, auto-abstracts
a document by ascertaining the most frequently
occurring words (significant words) and analyzing all sentences
in the text containing such words. A relative value of
the sentence significance is then established by a formula
which reflects the number of significant words contained in
a sentence and the proximity of these words to each other 
within the sentence. Several sentences which rank highest in 
value of significance are then extracted from the text to 
constitute the auto-abstract.



SUMMARY OF THE INVENTION 

Accordingly, it is an object of the invention to provide a 
method and apparatus for automatically excerpting and 
summarizing a document image without decoding or other- 
wise understanding the contents thereof. 

It is another object of the invention to provide a method 
and apparatus for automatically generating ancillary docu- 
ment images reflective of the contents of an entire primary 
document image. 

It is another object of the invention to provide a method 
and apparatus of the type described for automatically 
extracting summaries of material and providing links from 
the summary back to the original document. 

It is another object of the invention to provide a method 
and apparatus of the type described for producing Braille 
document summaries or speech synthesized summaries of a 
document. 

It is another object of the invention to provide a method 
and apparatus of the type described which is useful for 
enabling document browsing through the development of 
image gists, or for document categorization through the use 
of lexical gists. 

It is another object of the invention to provide a method 
and apparatus of the type described that does not depend 
upon statistical properties of large, pre-analyzed document 
corpora. 

The invention provides a method and apparatus for seg- 
menting an undecoded document image into undecoded 
image units, identifying semantically significant image units 
based on an evaluation of predetermined image characteristics
of the image units, without decoding the document
image or reference to decoded image data, and utilizing the 
identified significant image units to create an ancillary 
document image of abbreviated information content which 
is reflective of the subject matter content of the original 
document image. In accordance with one aspect of the 
invention, the ancillary document image is a condensation or 
summarization of the original document image which facili- 
tates browsing. In accordance with another aspect of the 
invention, the identified significant image units are pre- 
sented as an index of key words, which may be in decoded 
form, to permit document categorization. 

Thus, in accordance with one aspect of the invention, a 
method is presented for excerpting information from a 
document image containing word image units. According to 
the invention, the document image is segmented into word 
image units (word units), and the word units are evaluated 
in accordance with morphological image properties of the 
word units, such as word shape. Significant word units are 
then identified, in accordance with one or more predeter- 
mined or user selected significance criteria, and the identi- 
fied significant word units are outputted. 

In accordance with another aspect of the invention, an 
apparatus is provided for excerpting information from a 
document containing a word unit text. The apparatus 
includes an input means for inputting the document and 
producing a document image electronic representation of the 
document, and a data processing system for performing data
driven processing and which comprises execution processing
means for performing functions by executing program
instructions in a predetermined manner contained in a 
memory means. The program instructions operate the execu- 
tion processing means to identify significant word units in 
accordance with predetermined significance criteria from
morphological properties of the word units, and to output 
selected ones of the identified significant word units. The 
output of the selected significant word units can be to an 
electrostatographic reproduction machine, a speech synthesizer
means, a Braille printer, a bitmap display, or other
appropriate output means. 

These and other objects, features and advantages of the 
invention will be apparent to those skilled in the art from the 
following detailed description of the invention, when read in
conjunction with the accompanying drawings and appended 
claims.

BRIEF DESCRIPTION OF THE DRAWINGS

A preferred embodiment of the invention is illustrated in 
the accompanying drawings, in which:

FIG. 1 is a flow chart of a method of the invention; 

FIG. 2 is a block diagram of an apparatus according to the 
invention for carrying out the method of FIG. 1; 

FIG. 3 is a flow chart of a preferred embodiment of a 
method according to the invention for detecting function 
words in a scanned document image without first converting 
the document image to character codes; 

FIGS. 4A-4F show three sets of character ascender struc- 
turing elements where: FIGS. 4A-4B show a set of character 
ascender structuring elements of height 3 and length 5, 
where the solid dots are ON pixels along the bottom row and 
along one side column and there are one or more OFF pixels 
in a remaining location preferably separated from the ON 
pixels; FIGS. 4C-4D show a set of character ascender 
structuring elements of height 4 and length 5; and FIGS. 
4E-4F show a set of character ascender structuring elements
of height 5 and length 5;

FIGS. 5A-5F show three sets of character descender 
structuring elements where: FIGS. 5A-5B show a set of 
character descender structuring elements of height 3 and 
length 5; FIGS. 5C-5D show a set of character descender 
structuring elements of height 4 and length 5; and FIGS. 
5E-5F show a set of character descender structuring ele- 
ments of height 5 and length 5; 

FIG. 6 shows a horizontal structuring element of length 5; 

FIG. 7 shows a block system diagram of the arrangement 
of system components forming a word shape recognition 
system; 

FIG. 8 shows a block system diagram for identifying 
equivalence classes of image units;

FIG. 9 shows a block system diagram for identifying
significant image units;

FIG. 10 shows an image sample of example text over 
which the inventive process will be demonstrated; 

FIG. 11 is a copy of a scanned image of the example text; 

FIGS. 12A, 12B and 12C graphically illustrate the process 
used to determine the angle at which the example text is 
oriented in the image sample prior to further processing,
while FIG. 12D shows graphs of the responses taken from 
the example text, which are used to determine the angle at 
which the example text is oriented in the image sample prior 
to further processing;

FIGS. 13A and 13B respectively show the derivation and
use of a graph examining the sample image of the example 
text to determine baselines of text within the image; 

FIGS. 14A and 14B are flowcharts illustrating the procedures
executed to determine the baselines shown in FIG. 13A;

FIG. 15 shows the scanned image of the example text with 
baselines indicated thereon after derivation from the data 
shown in FIGS. 13A and 13B; 

FIG. 16 is a flowchart illustrating the steps used in the 
application of a median filter to the image of FIG. 10; 

FIG. 17 is an enlarged pictorial representation of a portion 
of the image of FIG. 10, illustrating the application of the 
median filter;

FIG. 18 demonstrates the resulting image after application 
of a median filter, a process known herein as blobifying, to 
the scanned image of the example text, which tends to render 
character strings as a single set of connected pixels; 

FIG. 19 shows a subsequent step in the process, in which
lines of white pixels are added to the blurred image to clearly 
delineate a line of character strings from adjacent lines of 
character strings; 

FIG. 20 is a flowchart illustrating the steps required to add 
the white lines of FIG. 19; 

FIGS. 21A and 21B are flowcharts representing the pro- 
cedure which is followed to segment the image data in 
accordance with the blurred image of FIG. 18; 

FIG. 22 shows the sample text with bounding boxes
placed around each word group in a manner which uniquely 
identifies a subset of image pixels containing each character 
string; 

FIGS. 23A and 23B illustrate derivation of a single 
independent value signal, using the example word "from",
which appears in the sample image of example text; 

FIG. 24 illustrates the resulting contours formed by the 
derivation process illustrated in FIGS. 23A and 23B; 

FIG. 25 illustrates the steps associated with deriving the 
word shape signals; 

FIGS. 26A, 26B, 26C and 26D illustrate derivation of a 
single independent value signal, using the example word 
"from"; 

FIGS. 27A, 27B, 27C and 27D illustrate derivation of a
single independent value signal, using the example word 
"red", which does not appear in the sample image of 
example text; 

FIG. 28 shows a simple comparison of the signals derived 
for the words "red" and "from" using a signal normalization
method; 

FIGS. 29A, 29B, and 29C illustrate the details of the 
discrepancy in font height, and the method for normalization 
of such discrepancies; 

FIG. 30 is a flowchart detailing the steps used for one 
method of determining the relative difference between word 
shape contours; 

FIG. 31 is a flowchart detailing the steps of a second 
method for determining the relative difference between word
shape contours; 

FIGS. 32A and 32B are respective illustrations of the 
relationship between the relative difference values calcu- 
lated and stored in an array, for both a non-slope-constrained 
and a slope-constrained comparison; and

FIG. 33 is a block diagram of a preferred embodiment of
an apparatus according to the invention for detecting function
words in a scanned document image without first
converting the document image to character codes.

The Appendix contains source code listings for a series of 
image manipulation and signal processing routines which 
have been implemented to demonstrate the functionality of 
the present invention. Included in the Appendix are four 
sections which are organized as follows: 

Section A, beginning at page 1, comprises the declarative 
or "include" files which are commonly shared among the 
functional code modules; 

Section B, beginning at page 26, includes the listings for 
a series of library type functions used for management of the 
images, error reporting, argument parsing, etc.; 

Section C, beginning at page 42, comprises numerous 
variations of the word shape comparison code, and further 
includes code illustrating alternative comparison techniques 
than those specifically cited in the following description; 

Section D, beginning at page 145, comprises various 
functions for the word shape extraction operations that are 
further described in the following description. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENT 

In contrast to prior techniques, such as those described 
above, the invention is based upon the recognition that 
scanned image files and character code files exhibit impor- 
tant differences for image processing, especially in data 
retrieval. The method of a preferred embodiment of the 
invention capitalizes on the visual properties of text con- 
tained in paper documents, such as the presence or fre- 
quency of linguistic terms (such as words of importance like 
"important", "significant", "crucial", or the like) used by the 
author of the text to draw attention to a particular phrase or 
a region of the text; the structural placement within the 
document image of section titles and page headers, and the 
placement of graphics; and so on. A preferred embodiment 
of the method of the invention is illustrated in the flow chart 
of FIG. 1, and an apparatus for performing the method is 
shown in FIG. 2. For the sake of clarity, the invention will 
be described with reference to the processing of a single 
document. However, it will be appreciated that the invention
is applicable to the processing of a corpus of documents 
containing a plurality of documents. More particularly, the 
invention provides a method and apparatus for automatically 
excerpting semantically significant information from the 
data or text of a document based on certain morphological 
(structural) image characteristics of image units correspond- 
ing to units of understanding contained within the document 
image. The excerpted information can be used, among other 
things, to automatically create a document index or sum- 
mary. The selection of image units for summarization can be 
based on frequency of occurrence, or predetermined or user 
selected selection criteria, depending upon the particular 
application in which the method and apparatus of the 
invention is employed. 

The invention is not limited to systems utilizing document 
scanning. Rather, other systems such as a bitmap worksta- 
tion (i.e., a workstation with a bitmap display) or a system 
using both bitmapping and scanning would work equally 
well for the implementation of the methods and apparatus 
described herein. 

With reference first to FIG. 2, the method is performed on 
an electronic image of an original document 5, which may 
include lines of text 7, titles, drawings, figures 8, or the like, 
contained in one or more sheets or pages of paper 10 or other
tangible form. The electronic document image to be processed
is created in any conventional manner, for example,
by a conventional scanning means such as those incorpo- 
rated within a document copier or facsimile machine, a 
Braille reading machine, or by an electronic beam scanner or 
the like. Such scanning means are well known in the art, and 
thus are not described in detail herein. An output derived 
from the scanning is digitized to produce undecoded bit 
mapped image data representing the document image for 
each page of the document, which data is stored, for 
example, in a memory 15 of a special or general purpose 
digital computer data processing system 13. The data pro- 
cessing system 13 can be a data driven processing system 
which comprises sequential execution processing means 16 
for performing functions by executing program instructions 
in a predetermined sequence contained in a memory, such as 
the memory 15. The output from the data processing system 
13 is delivered to an output device 17, such as, for example, 
a memory or other form of storage unit; an output display 
17A as shown, which may be, for instance, a CRT display; 
a printer device 17B as shown, which may be incorporated 
in a document copier machine or a Braille or standard form 
printer; a facsimile machine, speech synthesizer or the like. 

Through use of equipment such as illustrated in FIG. 2, 
the identified word units are detected based on significant
morphological image characteristics inherent in the image 
units, without first converting the scanned document image 
to character codes. 

The method by which such image unit identification may 
be performed is described with reference now to FIG. 1. The
first phase of the image processing technique of the invention
involves a low level document image analysis in which
the document image for each page is segmented into undecoded
information containing image units (step 20) using
conventional image analysis techniques; or, in the case of
text documents, preferably using the bounding box method
described in copending U.S. patent application Ser. No.
07/794,392 filed concurrently herewith by Huttenlocher and
Hopcroft, and entitled "Method for Determining Boundaries
of Words in Text." The locations of and spatial relationships
between the image units on a page are then determined (step
25). For example, an English language document image can
be segmented into word image units based on the relative
difference in spacing between characters within a word and
the spacing between words. Sentence and paragraph boundaries
can be similarly ascertained. Additional region segmentation
image analysis can be performed to generate a
physical document structure description that divides page 
images into labelled regions corresponding to auxiliary 
document elements like figures, tables, footnotes and the 
like. Figure regions can be distinguished from text regions 
based on the relative lack of image units arranged in a line 
within the region, for example. Using this segmentation, 
knowledge of how the documents being processed are 
arranged (e.g., left-to-right, top-to-bottom), and, optionally, 
other inputted information such as document style, a "read- 
ing order" sequence for word images can also be generated. 
The term "image unit" is thus used herein to denote an 
identifiable segment of an image such as a number, charac- 
ter, glyph, symbol, word, phrase or other unit that can be 
reliably extracted. Advantageously, for purposes of docu- 
ment review and evaluation, the document image is seg- 
mented into sets of signs, symbols or other elements, such as 
words, which together form a single unit of understanding. 
Such single units of understanding are generally character- 
ized in an image as being separated by a spacing greater than 
that which separates the elements forming a unit, or by some
predetermined graphical emphasis, such as, for example, a
surrounding box image or other graphical separator, which 
distinguishes one or more image units from other image 
units in the scanned document image. Such image units 
representing single units of understanding will be referred to 
hereinafter as "word units." 
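
The spacing rule described above can be sketched in Python as
follows. This is only an illustration, not the bounding box
method of the cross-referenced application: the function name
split_into_word_units, the list-of-rows image representation
and the min_word_gap threshold are all assumptions introduced
here.

```python
def split_into_word_units(line, min_word_gap):
    """Return (start, end) column spans of word units in one text line.

    line: list of rows, each a list of 0/1 pixels (1 = ON, i.e. ink).
    min_word_gap: hypothetical threshold. Blank runs at least this wide
    are treated as inter-word spaces; narrower runs are treated as
    inter-character gaps inside a word.
    """
    width = len(line[0])
    # A column counts as inked if any row has an ON pixel in it.
    inked = [any(row[x] for row in line) for x in range(width)]

    spans = []
    start = None      # first column of the word unit being built
    last_ink = None   # most recent inked column seen so far
    for x, on in enumerate(inked):
        if not on:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_word_gap:
            # The blank run is wide enough to separate two word units.
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 0]]
print(split_into_word_units(line, min_word_gap=3))  # [(0, 4), (7, 9)]
```

In practice the threshold would presumably be derived from the
gap widths observed on the page rather than fixed in advance.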

Advantageously, a discrimination step 30 is next per- 
formed to identify the image units which have insufficient 
information content to be useful in evaluating the subject 
matter content of the document being processed. One pre- 
ferred method is to use the morphological function or stop 
word detection techniques disclosed in the copending U.S. 
patent application Ser. No. 07/794,190 filed concurrently 
herewith by D. Bloomberg et al., and entitled "Detecting 
Function Words Without Converting a Document to Char- 
acter Codes". 

The method of identification of image units which have 
insufficient information content by determining function 
words without converting the document to character codes is 
shown in FIG. 3. The following definitions are used to 
describe this method: 

A binary image contains pixels that are either ON or OFF. 
Binary images are manipulated according to a number of 
operations wherein one or more source images are mapped 
onto a destination image. The results of such operations are
generally referred to as images. 

A morphological operation refers to an operation on a 
pixelmap image (a source image), that uses a local rule at 
each pixel to create another pixelmap image, the destination 
image. This rule depends both on the type of the desired 
operation to perform as well as on the chosen structuring 
element. 

A structuring element (SE) refers to an image object of 
typically (but not necessarily) small size and simple shape 
that probes the source image and extracts various types of 
information from it via the chosen morphological operation. 
FIGS. 4 and 5 show SEs where a solid circle is a hit, and an 
open circle is a miss. The center position is denoted by a 
cross. Squares that have neither solid nor open circles are 
"don't cares"; their value in the image (ON or OFF) is not 
probed. A binary SE is used to probe binary images in a 
binary morphological operation that operates on binary input 
images and creates an output binary image. The SE is 
defined by a center location and a number of pixel locations, 
each normally having a defined value (ON or OFF). The 
pixels defining the SE do not have to be adjacent each other. 
The center location need not be at the geometrical center of 
the pattern; indeed it need not even be inside the pattern. A
solid SE refers to an SE having a periphery within which all 
pixels are ON. For example, a solid 2x2 SE is a 2x2 square 
of ON pixels. A solid SE need not be rectangular. A 
horizontal SE is generally one row of ON pixels and a 
vertical SE is generally one column of ON pixels of selected 
size. A hit-miss SE refers to an SE that specifies at least one 
ON pixel and at least one OFF pixel. 

AND, OR and XOR are logical operations carried out 
between two images on a pixel-by-pixel basis. 

NOT is a logical operation carried out on a single image 
on a pixel-by-pixel basis. 
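
A minimal sketch of these pixel-by-pixel operations, assuming
a binary image is held as a list of rows of 0/1 values (a
representation chosen here for illustration only; the helper
names are hypothetical):

```python
def combine(op, a, b):
    """Apply a two-input logical operation pixel by pixel.

    a and b are equal-sized binary images (lists of rows of 0/1).
    """
    return [[op(p, q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def image_and(a, b):
    return combine(lambda p, q: p & q, a, b)


def image_or(a, b):
    return combine(lambda p, q: p | q, a, b)


def image_xor(a, b):
    return combine(lambda p, q: p ^ q, a, b)


def image_not(a):
    """Invert every pixel of a single image."""
    return [[1 - p for p in row] for row in a]


a = [[1, 1], [0, 0]]
b = [[1, 0], [1, 0]]
print(image_and(a, b))  # [[1, 0], [0, 0]]
print(image_xor(a, b))  # [[0, 1], [1, 0]]
```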

EXPANSION is a scale operation characterized by a scale
factor N, wherein each pixel in a source image becomes an 
NxN square of pixels, all having the same value as the 
original pixel. 
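
As an illustration, EXPANSION can be sketched as pixel
replication. The function name expand and the list-of-rows
image representation are assumptions of this sketch.

```python
def expand(image, n):
    """Replace each pixel of a binary image by an n x n square of the
    same value (image is a list of rows of 0/1 values)."""
    out = []
    for row in image:
        # Repeat every pixel n times horizontally...
        wide = [p for p in row for _ in range(n)]
        # ...then repeat the widened row n times vertically.
        out.extend(list(wide) for _ in range(n))
    return out


print(expand([[1, 0]], 2))  # [[1, 1, 0, 0], [1, 1, 0, 0]]
```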

REDUCTION is a scale operation characterized by a
scale factor N and a threshold level M. REDUCTION with
scale=N entails dividing the source image into NxN squares
of pixels, mapping each such square in the source image to
a single pixel on the destination image. The value for the
pixel in the destination image is determined by the threshold
level M, which is a number between 1 and N². If the number
of ON pixels in the pixel square is greater than or equal to M, the
destination pixel is ON, otherwise it is OFF.
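
A hedged sketch of thresholded REDUCTION follows. The function
name reduce_threshold is hypothetical, and dropping any partial
squares at the right and bottom edges is an assumption made
here; the text does not say how such edges are handled.

```python
def reduce_threshold(image, n, m):
    """Map each n x n square of a binary image to one pixel.

    The destination pixel is ON when the square holds at least m ON
    pixels. Partial squares at the right/bottom edges are dropped
    (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - height % n, n):
        row = []
        for x in range(0, width - width % n, n):
            on_count = sum(image[y + dy][x + dx]
                           for dy in range(n) for dx in range(n))
            row.append(1 if on_count >= m else 0)
        out.append(row)
    return out


image = [[1, 0, 1, 1],
         [0, 0, 1, 1]]
print(reduce_threshold(image, n=2, m=1))  # [[1, 1]]
print(reduce_threshold(image, n=2, m=2))  # [[0, 1]]
```

A low threshold (M=1) behaves like an OR over each square and a
threshold of N² like an AND, so M trades off preserving thin
strokes against suppressing isolated noise pixels.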

EROSION is a morphological operation wherein a given
pixel in the destination image is turned ON if and only if the
result of superimposing the SE center on the corresponding
pixel location in the source image results in a match between
all ON and OFF pixels in the SE and the underlying pixels
in the source image. An EROSION will give one pixel in the
destination image for every match. That is, at each pixel, it
outputs 1 if the SE (shifted and centered at that pixel) is
totally contained inside the original image foreground, and
outputs 0 otherwise. Note that EROSION usually refers to
operations using an SE with only hits, whereas matching
operations with both hits and misses are more generally
called a hit-miss transform. The term EROSION is used herein to
include matching operations with both hits and misses, thus
the hit-miss transform is the particular type of EROSION
used herein.
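
The matching rule can be sketched as follows. The SE encoding
(a list of (dy, dx, value) probes relative to the center), the
helper names, and the rule that pixels outside the image count
as OFF are all assumptions of this sketch rather than details
taken from the disclosure.

```python
def pixel_at(image, y, x):
    """Return the pixel at (y, x); locations outside the image are
    treated as OFF (a border rule assumed for this sketch)."""
    if 0 <= y < len(image) and 0 <= x < len(image[0]):
        return image[y][x]
    return 0


def erode(image, se):
    """EROSION in the hit-miss sense used in the text.

    se: list of (dy, dx, value) probes relative to the SE center,
    where value 1 is a hit (source pixel must be ON) and value 0 is
    a miss (source pixel must be OFF). Unlisted positions are
    "don't cares" and are not probed.
    """
    return [[int(all(pixel_at(image, y + dy, x + dx) == value
                     for dy, dx, value in se))
             for x in range(len(image[0]))]
            for y in range(len(image))]


row = [[0, 1, 1, 1, 0]]
hits_only = [(0, -1, 1), (0, 0, 1), (0, 1, 1)]   # horizontal SE, length 3
right_edge = [(0, 0, 1), (0, 1, 0)]              # ON pixel, OFF to its right
print(erode(row, hits_only))   # [[0, 0, 1, 0, 0]]
print(erode(row, right_edge))  # [[0, 0, 0, 1, 0]]
```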

DILATION is a morphological operation wherein a given
pixel in the source image being ON causes the SE to be 
written into the destination image with the SE center at the 
corresponding location in the destination image. The SEs 
used for DILATION typically have no OFF pixels. The 
DILATION draws the SE as a set of pixels in the destination 
image for each pixel in the source image. Thus, the output 
image is the union of all shifted versions of the SE translated 
at all 1-pixels of the original image.
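
A minimal sketch of DILATION as the union of shifted copies of
the SE. Here the SE is assumed to be a list of (dy, dx) hit
offsets relative to its center, and any part of a shifted SE
that falls outside the image is simply discarded; both choices
are assumptions made for illustration.

```python
def dilate(image, se_hits):
    """Write the SE into the destination at every ON source pixel.

    se_hits: list of (dy, dx) offsets of the SE's ON pixels relative
    to its center. Parts falling outside the image are discarded.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            for dy, dx in se_hits:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    out[yy][xx] = 1
    return out


horizontal_3 = [(0, -1), (0, 0), (0, 1)]
print(dilate([[0, 0, 1, 0, 0]], horizontal_3))  # [[0, 1, 1, 1, 0]]
```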

FillClip is a morphological operation where one image is
used as a seed and is grown morphologically, clipping it at
each growth step to the second image. For example, a
fillClip could include a DILATION followed by logically
ANDing the DILATION result with another image.
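
One way to read this definition is as a seed fill, sketched
below. The 3x3 growth neighborhood, the initial clipping of the
seed, and repeating the grow-and-clip step until the result
stops changing are assumptions of this sketch; the text only
says that the seed is clipped at each growth step.

```python
def fill_clip(seed, clip):
    """Grow the seed image inside the clip image.

    Each step dilates the current image with a 3x3 solid SE and ANDs
    the result with clip, stopping when nothing changes. Both images
    are equal-sized lists of rows of 0/1 values.
    """
    height, width = len(seed), len(seed[0])
    # Start from the part of the seed that lies inside the clip image.
    current = [[s & c for s, c in zip(row_s, row_c)]
               for row_s, row_c in zip(seed, clip)]
    while True:
        grown = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                if not current[y][x]:
                    continue
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        inside = 0 <= yy < height and 0 <= xx < width
                        # Dilate, then clip: only pixels ON in clip survive.
                        if inside and clip[yy][xx]:
                            grown[yy][xx] = 1
        if grown == current:
            return current
        current = grown


clip = [[1, 1, 1, 0, 1, 1]]
seed = [[1, 0, 0, 0, 0, 0]]
print(fill_clip(seed, clip))  # [[1, 1, 1, 0, 0, 0]]
```

Used this way, a single seed pixel recovers the whole connected
region of the clip image that contains it, and nothing else.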

OPENING is a morphological operation that uses an 
image and a structuring element and consists of an EROSION
followed by a DILATION. The result is to replicate
the structuring element in the destination image for each 
match in the source image. 

CLOSING is a morphological operation using an image 
and a structuring element. It includes a DILATION followed 
by an EROSION of the image by a structuring element. A
CLOSE of an image is equivalent to the bit inverse of an 
OPEN on the (bit inverse) background. 
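
OPENING and CLOSING can be sketched by composing the two basic
operations. The sketch assumes a hits-only SE given as (dy, dx)
offsets and treats pixels outside the image as OFF, so the
CLOSING example keeps a blank margin around the content; these
conventions and all names are illustrative assumptions.

```python
def get(image, y, x):
    """Pixel value, treating locations outside the image as OFF."""
    inside = 0 <= y < len(image) and 0 <= x < len(image[0])
    return image[y][x] if inside else 0


def erode(image, se):
    """ON wherever every SE offset (dy, dx) lands on an ON pixel."""
    return [[int(all(get(image, y + dy, x + dx) for dy, dx in se))
             for x in range(len(image[0]))]
            for y in range(len(image))]


def dilate(image, se):
    """Write the SE into the output at every ON source pixel."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x]:
                for dy, dx in se:
                    if 0 <= y + dy < height and 0 <= x + dx < width:
                        out[y + dy][x + dx] = 1
    return out


def opening(image, se):
    """EROSION followed by DILATION."""
    return dilate(erode(image, se), se)


def closing(image, se):
    """DILATION followed by EROSION."""
    return erode(dilate(image, se), se)


HORIZONTAL_3 = [(0, -1), (0, 0), (0, 1)]
speck = [[0, 1, 0, 0, 1, 1, 1, 0, 0]]   # isolated pixel plus a 3-pixel run
gap = [[0, 0, 1, 1, 0, 1, 1, 0, 0]]     # two runs split by a 1-pixel gap
print(opening(speck, HORIZONTAL_3))  # [[0, 0, 0, 0, 1, 1, 1, 0, 0]]
print(closing(gap, HORIZONTAL_3))    # [[0, 0, 1, 1, 1, 1, 1, 0, 0]]
```

The OPENING keeps only places where the whole SE fits, which
removes the isolated pixel, while the CLOSING fills the gap that
is narrower than the SE.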

UNION is a bitwise OR between two images. An inter- 
section is a bitwise AND between two images. 

Blurring is a DILATION of an image by a structuring
element(s) consisting of two or more hits.

A mask refers to an image, normally derived from an 
original or source image, that contains substantially solid 
regions of ON pixels corresponding to regions of interest in
the original image. The mask may also contain regions of 
ON pixels that do not correspond to regions of interest. 

The various operations defined above are sometimes 
referred to in noun, adjective, and verb forms. For example, 
references to DILATION (noun form) may be in terms of
DILATING the image or the image being DILATED (verb 
forms) or the image being subjected to a DILATION opera- 
tion (adjective form). No difference in meaning is intended. 

Morphological operations have several specific properties 
that simplify their use in the design of appropriate procedures.
First, they are translationally invariant. A sideways
shift of the image before transforming does not change the
result, except to shift the result as well. Operations that are
translationally invariant can be implemented with a high 
degree of parallelism, in that each point in the image is 
treated using the same rule. In addition, morphological 
operations satisfy two properties that make it easy to visu- 
alize their geometrical behavior. First, EROSION, DILA- 
TION, OPEN and CLOSE are increasing, which means that 
if image 1 is contained in image 2, then any of these 
morphological operations on image 1 will also be contained 
in the morphological operation on image 2. Second, a 
CLOSE is extensive and OPEN is anti-extensive. This means
that the original image is contained in the image transformed 
by CLOSE and the image transformed by OPEN is con- 
tained in the original image. The DILATION and EROSION 
operations are also extensive and anti-extensive, respec- 
tively, if the center of the structuring element is located on 
a hit. 

The OPEN and CLOSE operations also satisfy two more 
morphological properties: 

(1) The result of the operation is independent of the position 
of the center of the structuring element. 

(2) The operation is idempotent, which means that reapply- 
ing the OPEN or CLOSE to the resulting image will not 
change it. 
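
The properties listed above can be checked on a small example. The following is a minimal sketch, assuming an image is represented as a set of (y, x) coordinates of ON pixels on an unbounded grid and a structuring element is a list of hit offsets; the function names are illustrative only and are not taken from the Appendix.

```python
def dilate(pixels, se):
    """Union of the SE translated to every ON pixel."""
    return {(y + dy, x + dx) for (y, x) in pixels for (dy, dx) in se}


def erode(pixels, se):
    """Positions at which every hit of the SE lands on an ON pixel."""
    candidates = {(y - dy, x - dx) for (y, x) in pixels for (dy, dx) in se}
    return {
        (y, x)
        for (y, x) in candidates
        if all((y + dy, x + dx) in pixels for (dy, dx) in se)
    }


def opening(pixels, se):
    return dilate(erode(pixels, se), se)


def closing(pixels, se):
    return erode(dilate(pixels, se), se)


# Horizontal SE of three hits, with its center on the left-most hit.
SE = [(0, 0), (0, 1), (0, 2)]
# The same SE with its center moved to the middle hit.
SE_SHIFTED = [(0, -1), (0, 0), (0, 1)]

# One row: a run of three, an isolated pixel, and another run of three.
image = {(0, x) for x in (0, 1, 2, 4, 6, 7, 8)}

opened = opening(image, SE)   # isolated pixel at x = 4 is removed
closed = closing(image, SE)   # gaps at x = 3 and x = 5 are filled

print(sorted(x for _, x in opened))  # [0, 1, 2, 6, 7, 8]
print(sorted(x for _, x in closed))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Here the OPEN result is contained in the original image and the original image is contained in the CLOSE result; reapplying either operation leaves its result unchanged, and moving the center of the structuring element (SE_SHIFTED) gives the same OPEN and CLOSE results.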

An image unit means an identifiable segment of an image 
such as a word, number, character, glyph or other units that 
can be extracted reliably and have an underlying linguistic 
structure. 

The term significant and its derivatives are used in this 
description to indicate the importance of particular charac- 
teristics of an image unit. An image unit with significant 
characteristics becomes a significant image unit in that it 
contains high value information which can be used for 
further processing of the document image. Significant char- 
acteristics of image units include a variety of classifiers such 
as length, width, location on a page of the document image, 
font, typeface and measurement by other parameters includ- 
ing, but not limited to: one or more cross-sections of a box 
(a cross-section being a sequence of ON or OFF pixels); a 
number of ascenders associated with an image unit; a 
number of descenders associated with an image unit; aver- 
age pixel density in an image unit; a length of a topline 
contour of an image unit, including peaks and troughs; a 
length of a base contour of an image unit, including
peaks and troughs; and the location of image units with 
respect to neighboring image units, e.g., vertical position 
and horizontal inter-image unit spacing. 

Referring to FIG. 3, the method for detecting function 
words in a scanned document image without first converting 
the document image to character codes will be described. An 
image of a page of a document is scanned in step 302 and 
the image is segmented into image units in step 304 by using 
either conventional image analysis techniques or by using
first a technique to determine baselines of image units and
then second a technique for providing bounding boxes
around image units (see U.S. patent application Ser. No.
07/794,391 entitled "A Method of Deriving Wordshapes for
Subsequent Comparison" by Huttenlocher et al.).

In step 306, a length and height of each image unit in the 
image is determined. Short image units are determined in 
step 308 as image units of no more than a predetermined 
number of characters, preferably three characters or less in 
length. In step 310, image units which are not short image 
units are deleted from the image. In step 312, the image is 
blurred or smeared in a horizontal direction although the 
image units are not smeared together. This can be accomplished,
for example, by CLOSING the image with a horizontal
structuring element such as the structuring element of
length 5 (i.e., 5 pixels) shown in FIG. 6. The length of the 
horizontal structuring element used to blur the x-height 
characters in the image is dependent upon the width of the 
character type being used. Furthermore, other configurations 
of structuring elements may be used in the CLOSING 
operation to obtain the same smearing effect. However, the 
most efficient and effective way to smear characters of 
x-height is to use a horizontal structuring element as 
described above. 

A UNION of erosions is taken in step 314 of the image by
using a set of ascender matching structuring elements such 
as those shown in FIGS. 4A-4F, and a set of descender 
matching structuring elements such as those shown in FIGS. 
5A-5F. The UNION taken in step 314 provides optional 
noise elimination filtering, and the UNION will provide a 
seed from which to fill short image unit masks in a subse- 
quent seed filling operation such as the fillClip operation of 
step 316. The UNION of step 314 acts on all image units
remaining in the image (i.e., only short image units in this 
case) and since the UNION of erosions was taken using a set 
of ascender matching structuring elements and a set of 
descender matching structuring elements, the image units 
that will be filled are those containing ascender and/or 
descender characters, i.e., function words. The function 
words are identified in step 318 as those image units which
are filled short image unit masks. 

In step 320, a test occurs to determine whether a last page 
of the document has been scanned. If the last page has been 
scanned, then the method terminates at step 324; otherwise
the page is incremented in step 322 and the next page is
scanned in step 302, whereupon the previously described steps of
the method are repeated. Of course, all pages could be
scanned and stored as bit map images in a memory prior to 
performing the function word identification procedures 
described above. Moreover, the image segmentation step 
can also be performed prior to performing this method and 
the segmented image stored in memory. 

This is only one preferred method to perform the discrimination
step 30 of FIG. 1. Using this method, the image
units which have insufficient information content to be
useful in evaluating the subject matter content of the document
being processed are identified.

Next, in step 40, selected image units, e.g., the image units 
not discriminated in step 30, are evaluated, without decoding 
the image units being classified or reference to decoded 
image data, based on an evaluation of predetermined mor- 
phological (structural) image characteristics of the image 
units. The evaluation entails a determination (step 41) of the 
image characteristics and a comparison (step 42) of the 
determined image characteristics for each image unit with 
the determined image characteristics of the other image 
units. 

One preferred method for defining the image unit image 
characteristics to be evaluated is to use the word shape 
derivation techniques disclosed in the copending U.S. patent 
application Ser. No. 07/794,391 filed concurrently herewith 
by D. Huttenlocher and M. Hopcroft, and entitled "A 
Method of Deriving Wordshapes for Subsequent Comparison,"
Published European Application No. 0543594, published
May 26, 1993. As described in the aforesaid application,
at least one, one-dimensional signal characterizing
the shape of a word unit is derived; or an image function is 
derived defining a boundary enclosing the word unit, and the 
image function is augmented so that an edge function 
representing edges of the character string detected within the 
boundary is defined over its entire domain by a single
independent variable within the closed boundary, without 
individually detecting and/or identifying the character or 
characters making up the word unit. 

More specifically, the above reference discloses a method 
for deriving, defining, and comparing words in terms of their 
shapes. It will, of course, be recognized that each element of 
the system may be many devices, or may simply be a 
program operated within a single device. The method will be 
described with reference to FIG. 7. Beginning with an input 
bitmap 710, a bitmap of an image is initially directed to a 
segmentation system 712, in which words, or character 
strings, or other multi-character units of understanding, will 
be derived. Initially, the image bitmap passes through skew 
detector 714, which determines the angle of orientation of
text in the image. Using information about the orientation of 
the image, and the image itself, at text baseline processor 
716, toplines and baselines of the text are determined, so that
upper and lower boundaries of lines of text within the image 
are identified. 

At median filter 718, the function referred to as "blobify" 
is performed, which operates on the image so that each word 
group in a line may be treated as a single unit. As used 
herein, "word", "symbol string" or "character string" refers 
to a set of connected alphanumeric or punctuation elements, 
or more broadly, signs or symbols which together form a 
single unit of semantic understanding. It will be appreciated 
that these terms may also be used to refer to the images 
thereof. Such single units of understanding are characterized 
in an image as separated by a spacing greater than that which 
separates the elements, signs or symbols forming the unit. To 
the blobified image, a set of white lines is added at block
720, to clearly separate adjacent lines of text. The white lines 
are based on baseline determinations provided by processor 
716. Using this information, i.e., the blobified words, which 
are clearly separated from adjacent words and words in 
adjacent lines, a bounding box is defined about the word at 
block 722, thereby identifying and enclosing the word. 

Thereafter word shape signal computer 724 derives a 
word shape signal representing the individual words in the 
image, based on the original image and the bounding box 
determinations. This information is then available for use at 
a word shape comparator 726, for comparing word shape 
signals representative of known words from a word shape 
dictionary 728, with the as yet unidentified word shape 
signals. In an alternative embodiment word shape compara- 
tor 726 may be used to compare two or more word shapes 
determined from image 710. More importantly, word shape 
comparator 726 is not limited to the comparison of word 
shapes from unrecognized strings of characters to known 
word shapes. In a simplified context, comparator 726 is 
merely an apparatus for comparing one word shape against 
another to produce a relative indication of the degree of 
similarity between the two shapes. 

In general, a method accomplishing this technique 
includes the following steps. Once orientation of the image 
is established and line spacing and word group spacing is 
established, each word can be surrounded by a bounding 
box. A reference line is then created extending through the 
character string image. The reference line may be a block 
having a finite thickness ranging from two-thirds of the x 
height to one-third of the x height, or in fact it may have a 
zero width. At the resolution of the image, the distance from 
the reference line to the upper edge of the text contour or 
bounding box is measured in a direction perpendicular to the 
reference line. Similarly, measurements may be made from 
the reference line to the lower bounding box edge or to the 
text contour along the lower portion of the word, whichever
is closer. Because the set of values derived computationally 
can be expressed in terms of position along the horizontal 
axis versus length, the signal can be considered a single 
independent variable or one dimensional signal. Either or
both of these sets of values may be used to describe the word 
shape. Additionally, although possibly less desirable, it is 
well within the scope of this method to measure the distance 
of a perpendicular line drawn from the top of the bounding 
box or the bottom of the bounding box, to the first contact
with the word or the reference line, as desired. 
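
The last variant described above, measuring from the top or the bottom of the bounding box to the first contact with the word, lends itself to a short sketch. It assumes the word has already been cropped to its bounding box as a list of 0/1 rows; the function name contour_signals and the convention of reporting the full box height for an empty column are assumptions made for this illustration.

```python
def contour_signals(word):
    """Return (upper, lower): for each column, the distance in pixels from
    the top (resp. bottom) edge of the bounding box to the first ON pixel."""
    height, width = len(word), len(word[0])
    upper, lower = [], []
    for x in range(width):
        column = [word[y][x] for y in range(height)]
        if 1 in column:
            upper.append(column.index(1))        # scan downward from the top
            lower.append(column[::-1].index(1))  # scan upward from the bottom
        else:
            # Assumed convention: an empty column reports the full box height.
            upper.append(height)
            lower.append(height)
    return upper, lower


word = [
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
upper, lower = contour_signals(word)
print(upper)  # [1, 0, 4, 1]
print(lower)  # [1, 1, 4, 0]
```

Each list is a one dimensional signal indexed by horizontal position, which is the form in which the word shapes are later compared.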

With a system and process for word shape derivation 
given, the method may also be considered mathematically. 
Considering image data i(x,y), which in one common case 
could be an array of image data in the form of a bitmap, a
character set is identified in one of many methods, perhaps
as described above, which defines a boundary enclosing the
selected symbol string within a subset of the array of image
data. From i(x,y), an edge signal e(x,y), which represents the
edges of i(x,y) detected within the closed boundary, is
derived. The edge signal is augmented by adding additional
data to i(x,y) so that e(x,y) is a signal e'(x,y) defined over its
entire domain with respect to a single dimension or variable
within the closed boundary. One, two, or more signals may
be derived from e'(x,y) which are each one dimensional
signals g'(t), where g is a function of parameter t which is a 
reference frame dependent parameter. 

It is important to realize that the mathematical process 
used for the derivation of the one dimensional signal is 
essentially reversible up to the information it contains, e.g.,
a bitmap may be reconstructed from the upper and lower
bitmap contours. It will be noted that if the reference has a
finite thickness and is therefore taken out of the image, that
portion of the image is not identifiable; however, if it has a
zero width the information still remains.

A recognition dictionary, or look up table of word shapes, 
can clearly be created through use of the described process. 
The process can be operated using either scanned words
as the source of the information, or in fact, they can be
computer generated for a more "perfect" dictionary.

A detailed example using this method is disclosed in the 
U.S. patent application Ser. No. 07/794,391.

To demonstrate the process of the invention, at FIG. 10,
a sample image, taken from a public domain source, is
shown, having several lines of text contained therein. FIG.
10 demonstrates approximately how the image would appear
on the page of text, while FIG. 11 shows a scanned image
of the page, which demonstrates an enlargement of the
image of a bitmap that would present problems to known
OCR methods. Looking at, for example, the image of the
word 50a "practitioner" in the first line of the text image, it
may be seen that several of the letters run together. Also, at
the lower right hand portion of the image, circled and
numbered 52, noise is present. Looking at the word
"practitioner's", circled and numbered 54, the running together of
a punctuation mark and a letter is further noted.

With reference again to FIG. 7, in one possible embodiment
of the invention, skew detector 714 may be implemented
using a general method for determining the orientation
of the text lines in the image. This method looks at a
small number of randomly selected edge pixels (defined as 
a black pixel adjacent to at least one white pixel), and for 
each edge pixel considers, at FIG. 12A, a number of lines, 
56a, 56b, 56c being examples, extending from the pixel at
evenly spaced angular increments over a specified range of
angles. The edge pixels are selected randomly from the set
of all image pixels by the function RandomEdgePixel()
(Appendix, page 243). FIGS. 12A (see lines 56a, 56b, 56c),
12B (see lines 58a, 58b, 58c) and 12C (see lines 60a, 60b,
60c) represent a series of increasingly smaller angular 
ranges over which the above mentioned technique is applied 
to illustrative edge pixels to accurately determine the angu- 
lar orientation of the text within the image. Subsequent to 
finding edge pixels and defining the lines, skew detector 714 
traces the path of each line, determining the lengths, in 
pixels, of strings of successive black pixels which are 
intersected by the line. Upon reaching the image boundary, 
an average black pixel string length is calculated by sum- 
ming the lengths of the individual strings, and dividing the 
sum by the total number of distinct strings which were 
found. This operation is carried out for all the lines, thereby 
arriving at an average black pixel string length for each line 
extending from the selected edge pixel. These lengths are 
plotted on FIG. 12D as curve A, showing minima at approximately
0 and 3.14 radians. Curve A is a graphical representation
of the summation/averaging function over each of a
series of angled lines extending from the edge pixel, and
spread over a range from 0 to 2π radians. Once a first
minimum has been located, verification of the minimum (in
the example, approximately 0 radians) is achieved by determining
whether a second minimum exists at approximately
π radians from the first minimum. Upon verifying the
existence of a second minimum (in the example, approximately
3.14 or π radians), a coarse skew angle is identified.
Subsequently, it is necessary to more closely determine the 
skew angle of the text. This is accomplished by utilizing a 
number of lines which extend from a randomly selected 
edge pixel, where the lines differ by smaller angular incre- 
ments, and the angular range is centered about the coarse 
skew angle. However, the fine skew angle may be deter- 
mined by analyzing the total number of black pixels con- 
tained along a predetermined length of the lines. More 
specifically, the number of pixels over a unit distance is
plotted as curve B on FIG. 12D, and the fine skew angle is
determined by identifying the maximum of the curve. In other
words, the point of the curve where the highest concentra- 
tion of black pixels per unit line length exists, more accu- 
rately represents the angle of the text lines in the image. As 
shown by curve B, this results in a fine skew angle of 
approximately 0 radians, where the line intersects with the 
most black pixels along its length, and therefore is repre- 
sentative of the closest angle of orientation that needs to be 
determined. 
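
The per-line measurement underlying curve A can be illustrated as follows. This is a rough sketch rather than the RandomEdgePixel() or NewFine() code of the Appendix: it traces a single line from a given edge pixel by unit steps with rounding, treats the image as a list of 0/1 rows with y increasing downward, and uses a toy image of three vertical strokes in place of scanned text. The name average_run_length is hypothetical.

```python
import math


def average_run_length(img, x0, y0, angle):
    """Trace a line from (x0, y0) at the given angle until it leaves the
    image, and return the average length of the black-pixel runs it meets."""
    height, width = len(img), len(img[0])
    dx, dy = math.cos(angle), math.sin(angle)
    runs, current, t = [], 0, 0
    while True:
        x = int(round(x0 + t * dx))
        y = int(round(y0 + t * dy))
        if not (0 <= x < width and 0 <= y < height):
            break  # reached the image boundary
        if img[y][x]:
            current += 1
        elif current:
            runs.append(current)
            current = 0
        t += 1
    if current:
        runs.append(current)
    return sum(runs) / len(runs) if runs else 0.0


# Toy "text line": three one-pixel-wide vertical strokes, three pixels tall.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

along_text = average_run_length(image, 1, 1, 0.0)           # crosses strokes
across_text = average_run_length(image, 1, 1, math.pi / 2)  # runs down a stroke
print(along_text, across_text)  # 1.0 3.0
```

In this toy image the average run length is smaller along the direction of the text line than across it, which is the kind of minimum the coarse search looks for. A fuller sketch would repeat the measurement over many angles and several randomly chosen edge pixels, as the text describes.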

Alternatively, the skew angle may be determined as 
indicated by the NewFine() function (Appendix, page 245), 
which determines the skew angle using multiple iterations of 
the procedure described with respect to the fine angle 
determination. As indicated by FIGS. 12A, 12B, and 12C, 
each iteration would also use lines covering an increasingly 
smaller angular range, until a desired skew angle accuracy 
is reached. In the implementation illustrated by FIGS. 12A, 
12B, and 12C, the desired accuracy is achieved by a series 
of three iterations, each using a series of 180 distinct angles 
about the selected edge pixel. 

In the next process step, illustrated in the graphs of FIG. 
13A and FIG. 13B, text baseline processor 716 identifies the 
characteristic lines, upper topline and lower baseline, of 
each line of text. The process steps executed by text baseline 
processor 716 are illustrated in detail in FIGS. 14A and 14B.
The histogram of FIG. 13A, shown to the left along the 
image, is derived by examining lines, at the resolution of the 
image, and oriented parallel to the skew orientation of the 
image, as defined by the previously determined skew angle. 
These parallel lines spanning the image are used to determine
the number of black pixels intersected by each of the
lines. Along lines passing through inter text line spaces, no 
black pixels should be intercepted, while along lines through 
the text, large numbers of black pixels should be intercepted. 

More specifically, the function BaseLines() (Appendix,
page 160) first finds the coordinates of a "main" line, block
142, constructed through the center of the image and perpendicular
to the text lines, as determined by the skew angle
passed to the function as shown by block 140. Next, the
LineEngine() procedure 144 is executed, where by proceeding
along the main line from one end to the other, at a series of
points along the main line, perpendicular branch lines are
constructed which extend outwardly from the main line for
a fixed distance, block 146. Along the branch lines, the
number of black vertical edge pixels is counted, block 148,
and the number of black pixels intersected by the lines is
counted, block 150, and summed for the opposing pairs of
lines, block 152. Black vertical edge pixels, as counted by
block 148, are defined as black pixels having a white
neighboring pixel at either the upper or lower neighboring
pixel position. The LineEngine() procedure 144 is repeated until
all points, and associated branch lines, along the main line
have been processed, as determined by decision block 154.
An x-height value may be returned from this procedure,
which will subsequently be used by the word shape computer
724.

Subsequently, the counts for all the branch lines are 
analyzed to determine the branch line pairs having the 
highest ratio of black vertical edge pixels to black pixels. In 
general, those lines having the highest percentages would
correspond to lines passing along the upper and lower edges
of the characters which form the text lines. As illustrated in
the enlarged view of FIG. 13B, a definite distinction exists
between those branch lines having a high vertical edge pixel
ratio, line 82, and those having a low ratio, line 84. Application
of a filter mask and comparison of the maximum
peaks within the mask enables the identification of those
lines which represent the text toplines and baselines, for
example, line 82. The process is implemented in the
maxFilter.c module, beginning at line 57, the code for which is
also incorporated in the newBaselines.c module at line 274,
page 214. Baseline determination is described in further
detail in a copending U.S. patent application, for a "Method
for Determining Boundaries of Words in Text", Huttenlocher
et al., U.S. patent application Ser. No. 07/794,392,
which has been previously incorporated herein by reference.
An additional test may also be applied to the histogram
operation of step 150. This added test, a boolean test, may
be used to assure that a minimum run of black pixels was
detected during the analysis of the line. For example, a flag,
which is cleared at the start of each branch line analysis, may
be set whenever a series of five sequential black pixels is
detected along the line. This test would assure that small
noise or image artifacts are not recognized as baselines due
to a high vertical edge pixel ratio.
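
The vertical edge pixel ratio and the minimum-run flag can be sketched for the simplest case of an unskewed image, in which each branch line is just an image row. This is an illustration only, not the BaseLines() or LineEngine() code; the function name row_edge_ratios is hypothetical, and pixels outside the image are assumed to be white.

```python
def row_edge_ratios(img, min_run=5):
    """For each row, return (black vertical edge pixels) / (black pixels),
    or 0.0 if the row never contains min_run sequential black pixels."""
    height, width = len(img), len(img[0])

    def pixel(y, x):
        # Assumption: anything outside the image counts as white.
        return img[y][x] if 0 <= y < height else 0

    ratios = []
    for y in range(height):
        black = edges = run = 0
        has_min_run = False  # flag cleared at the start of each line
        for x in range(width):
            if img[y][x]:
                black += 1
                run += 1
                if run >= min_run:
                    has_min_run = True
                # Vertical edge pixel: white neighbor directly above or below.
                if pixel(y - 1, x) == 0 or pixel(y + 1, x) == 0:
                    edges += 1
            else:
                run = 0
        ratios.append(edges / black if black and has_min_run else 0.0)
    return ratios


# A solid block standing in for a blurred text line on rows 1 to 3.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
ratios = row_edge_ratios(image)
print(ratios)  # [0.0, 1.0, 0.0, 1.0, 0.0]
```

The two rows with a high ratio (rows 1 and 3) play the role of the topline and baseline, while the interior row has a ratio of zero.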

As an alternative method, it is possible to utilize the total 
number of black pixels lying along the branch lines to 
determine the locations of the baselines. Using histogram 
curve BL, which represents the number of black pixels 
counted along the branch lines, it is possible to determine
which branch lines have the most black pixel intersections.
Applying a threshold of the maximum allows the determination
of the upper and lower characteristic line pairs for
each text line. Hence, the rising and falling portions of the
histogram curve BL constitute the characteristic lines of the
text, and the threshold would be used to specifically identify
the localized maxima surrounding an intervening minimum,
thereby enabling identification of the baseline positions
which would be used for further processing. More impor- 
tantly, this alternative approach, illustrated as step 162, may 
be utilized to identify the upper and lower baselines of a 
baseline pair, based upon the slope of the BL histogram 
curve. It is important to note that there is little additional 
processing associated with the identification step as the 
histogram information was collected previously during step 
150. Once the preliminary characteristic line or baseline 
pairs are identified, block 162, a verification step, block 164, 
is executed to verify that the baseline pairs are separated by 
more than a minimum distance, the minimum distance being 
established by calculating the average line pair separation 
for all line pairs in the image. After verification, the valid 
baseline information is stored by output block 166 for later 
use by the white line addition and segmentation blocks, 18 
and 720, respectively. 
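
A minimal sketch of this alternative, again assuming an unskewed image so that each branch line is an image row, might look as follows. The threshold fraction of 0.5 and the function name baseline_pairs are assumptions made for the example; the verification against the average line pair separation (block 164) is not included.

```python
def baseline_pairs(img, fraction=0.5):
    """Threshold the per-row black pixel histogram at a fraction of its
    maximum and return (topline_row, baseline_row) pairs."""
    counts = [sum(row) for row in img]  # histogram curve BL
    threshold = fraction * max(counts)
    pairs, start = [], None
    for y, count in enumerate(counts):
        if count > threshold and start is None:
            start = y                    # rising portion of the histogram
        elif count <= threshold and start is not None:
            pairs.append((start, y - 1))  # falling portion
            start = None
    if start is not None:
        pairs.append((start, len(counts) - 1))
    return pairs


# Build a toy image whose rows contain the given numbers of black pixels.
row_counts = [0, 5, 6, 5, 0, 0, 4, 6, 5, 0]
image = [[1] * c + [0] * (6 - c) for c in row_counts]

pairs = baseline_pairs(image)
print(pairs)  # [(1, 3), (6, 8)]
```

Because the histogram is already available from the counting step, this identification adds little processing, as noted above.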

An important advantage of these baseline determination
methods is that they are highly insensitive to noise or
extraneous marks in the interline space. FIG. 15 shows the
result of the baseline determination on the example image of
the sample text, showing that baseline pair, baseline and
topline Bn and Bn', respectively, have been located on the
image, indicating those portions of the image in which a 
predominant portion of the text occurs. While some portions 
of the character ascender strokes are outside the baselines, 
no detriment to the remainder of the process is noted. Of 
course, a smaller threshold value might enable the system to 
capture more of the ascending strokes. 

With reference again to FIG. 7 in conjunction with FIGS. 
16 and 17, the next process step is a word group isolation 
step. A filter 718 is applied to a copy of the image which 
results in an image that tends to render the word into blobs 
distinguishable from one another. The filter is applied with 
a small window, to each area, to render as black those areas 
that are partly black. As shown in FIG. 16, the blobify 
function (Appendix page 165) first initializes mask variables 
which establish the mask size and angle, block 180, and then 
processes the upper scanline to initialize the data array, 
block 182. Median filtering is accomplished by sequentially 
moving the mask window through the image, blocks 184 and 
186, and whenever the number of black pixels appearing in 
the window exceeds a threshold value, the target pixel, about 
which the window is located, is set to black. FIG. 17, which 
illustrates some examples of the filter process, has a mask 
window 200 placed over a portion of the image. For 
example, with a twenty percent threshold and a generally 
rectangular mask having twenty-one pixels, arranged at an 
angle approximately equal to the skew determined for the
text, the result of filtering in window 200 would be the 
setting of pixel 204 to black. Similarly, window 206, which 
primarily lies within the intercharacter spacing between the 
pixel representations of the letters V and "o", would cause 
pixel 208 to be set to black. On the other hand, window 210, 
which lies in the region between word groups, would not 
have a sufficient number of black pixels present within the 
window to cause pixel 212 to be set to black. The size, shape 
and orientation of mask window 200 is optimized to reduce 
the filling in between text lines, while maximizing the fill 
between letters common to a single word. 
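
The filter can be sketched for a horizontal mask, i.e., for text with zero skew. The window width of five pixels used in the example is an assumption chosen to keep the output readable (the text above mentions a twenty-one pixel mask), and the function below is an illustration rather than the blobify function of the Appendix.

```python
def blobify(img, width=5, threshold=0.2):
    """Set a pixel to black when the fraction of black pixels in the
    horizontal window centered on it exceeds the threshold."""
    half = width // 2
    out = [row[:] for row in img]  # pixels already black stay black
    for y, row in enumerate(img):
        for x in range(len(row)):
            window = row[max(0, x - half): x + half + 1]
            if sum(window) > threshold * width:
                out[y][x] = 1
    return out


# Two "words": strokes one pixel apart, separated by a six-pixel gap.
row = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1]
words = blobify([row])
print(words[0])  # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

The one-pixel gaps inside each word are filled, while the wider gap between the words never has enough black pixels in the window to be filled.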

As illustrated by FIG. 18, the result of the median filtering 
is that the relatively small spacing between characters in a 
word generally becomes inconsequential, and is filled with 
black pixels. Words become a single connected set of pixels, 
i.e., no white spaces completely separate characters in a 
single word. However, the relatively large spacing between 
character strings or between words, is a larger space outside 
of the ability of the filter to turn into black, and therefore
serves to distinguish adjacent symbol strings. With reference
now to FIGS. 15 and 18, it can be seen that the first two
words of the sample text, "A" and "practitioner" have been
"blobified", as this process is referred to, so that, for
example, the "p" of "practitioner" is no longer separated
from the "r" of that word. (Compare FIG. 11.) Once again,
despite the blobifying or blurring of characters, "A" and
"practitioner" remain as discrete blobs of connected symbols,
or words.

With reference again to FIG. 7, as an adjunct to this step, 
white line addition 720, superimposes upon the blobified 
image of FIG. 12 a series of white pixel lines to make certain 
that lines of text are maintained separately from adjacent 
lines of text (i.e., no overlapping of the filtered text lines).
With reference to FIGS. 18 and 19, noting the circled areas
258 and 258', a combination of an ascender and descender
has resulted in an interline merging of two words. The text
line overlap illustrated in area 258 of FIG. 18 is exactly what
is eliminated by superimposing the white lines on the
blobified or filtered image.

This superposition of white lines operation, the outcome 
of which is illustrated by FIG. 19, is carried out by the 
process illustrated in FIG. 20 as executed in the
DrawMiddleLines() function (Appendix page 233). Generally,
white lines WL are added to the image, approximately
halfway between adjacent baseline and topline pairs, to 
assure that there is no cross-text line blobifying. Once again, 
FIG. 19 shows the result of white line addition to the 
blobified image of FIG. 18. 

Referring now to FIG. 20, white line addition block 720 
begins by initializing variables in step 280 and subsequently 
reads in the topline location from the baseline information of 
the first text line. The topline information is discarded, block 
282, and the next baseline and topline locations are popped 
from the storage stack or list, blocks 284 and 286, respectively.
With respect to the image, this baseline-topline pair
respectively represents the bottom and top of adjacent text
lines. Next, at step 288, the point lying at the center of the
pair is located to provide a starting point for the white lines
which are drawn from the center of the image in an outward
direction. The endpoints of the white lines are calculated in
step 290, using the skew angle determined by skew detector
714 of FIG. 7. White lines are drawn or superimposed on the
blobified image at step 292, and the process is continued
until all text lines have been effectively separated, as controlled
by test block 294.
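
For the zero-skew case the white line addition reduces to clearing one row between each pair of adjacent text lines. The sketch below assumes that the baseline information is available as a top-to-bottom list of (topline_row, baseline_row) pairs; that representation and the name add_white_lines are hypothetical, and the endpoint calculation from the skew angle (step 290) is omitted.

```python
def add_white_lines(img, line_pairs):
    """Return a copy of img with a white row drawn halfway between the
    baseline of each text line and the topline of the next one."""
    out = [row[:] for row in img]
    width = len(img[0])
    for (_, baseline), (next_topline, _) in zip(line_pairs, line_pairs[1:]):
        middle = (baseline + next_topline) // 2
        out[middle] = [0] * width
    return out


# Two blobified text lines (rows 0-2 and 4-6) merged by a stroke in row 3.
blobified = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
separated = add_white_lines(blobified, [(0, 2), (4, 6)])
print(separated[3])  # [0, 0, 0]
```

After the white row is drawn, the two text lines no longer form a single connected set of pixels, which is the purpose of this step.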

With reference again to FIG. 7, as a result of the blobify 
or median filtering, the position of bounding boxes about 
each connected set of pixels formed in the blobify step may 
be determined. Bounding boxes are placed only about those
connected components or words that are in a text line lying
between the superimposed white lines. The bounding boxes
are placed at the orientation of the text line, by identifying
the extreme points of each group of connected pixels in the
direction of the text line, and in the direction orthogonal to
the text line, as opposed to the image coordinate system.
This operation is performed by the function FindBorders()
(Appendix, page 172). Generally, the FindBorders function
steps through all pixels within the image to find the bounding
boxes of the connected characters (Paint Component), to
determine the coordinates of the upper left corner of each
box, as well as the length and width of the box.

Referring now to FIGS. 21A and 21B, which detail the 
FindBorders() procedure, segmentation step 1022 begins by
placing a white border completely around the filtered image, 
step 1300. This is done to avoid running outside the edge of
the array of image pixels. Next, pixel and line counters, x
and y, respectively, are initialized to the first pixel location
inside the border. Calling the ReadPixel procedure, block
1304, the pixel color (black or white) is returned and tested 
in block 1306. If the pixel is white, no further processing is 
necessary and processing would continue at block 1322. 
Otherwise, the PaintComponent() procedure (Appendix,
page 171) is called and begins by storing the location of the 
black pixel in a queue, block 1308. Subsequently, in a copy 
of the image, the pixel is set to white and the boundaries of 
the box, surrounding the connected pixels or components, 
are updated, blocks 1310 and 1312, respectively. Next, 
adjoining black pixels are set to white, block 1314, and the 
locations of the black pixels are added to the end of the 
queue, block 1316. At block 1318 the queue pointers are 
tested to determine if the queue is empty. If not empty, the 
next pixel in the queue is retrieved, block 1320, and pro- 
cessing continues at block 1312. Otherwise, if the queue is 
empty, all of the connected black pixels will have been set 
to white and the box boundaries will reflect a box which 
encompasses the connected components. Subsequently, the 
boundaries of the box which encompasses the word segment 
are verified and may be adjusted to an orthogonal coordinate 
system oriented with respect to the skew of the text lines, 
block 1322. 

It will no doubt be apparent here that while finding each 
text line is an integral part of the described method, and 
serves to make the present embodiment more robust, other 
methods of deriving the information acquired by that step 
are possible. The primary uses of the text line finding function
are a) to determine x-height, and b) to define the white line
addition, for separating interline blobs. Certainly this step
may be removed, with a sacrifice in robustness, or other 
means of deriving the necessary information may be avail- 
able. 

The looping process continues at block 1324 which 
checks pixel counter x to determine if the end of the scanline 
has been reached, and if not, increments the counter at block 
1326 before continuing the process at block 1304. If the end 
of the scanline has been reached, pixel counter x is reset and 
scanline counter y is incremented at block 1328. Subse- 
quently, block 1330 checks the value of scanline counter y 
to determine if the entire image has been processed. If so, 
processing is completed. Otherwise, processing continues at 
block 1304 for the first pixel in the new scanline. 
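
The scanning loop and queue-based painting described for FIGS. 21A and 21B can be sketched as follows. This is an illustrative Python version, not the Appendix code: the names `find_borders` and `paint_component` merely echo the procedures named in the text, the image is assumed to be a list of rows with 1 for black and 0 for white, 8-connectivity is an assumption, and the final adjustment to skew-aligned coordinates (block 1322) is omitted.

```python
from collections import deque


def find_borders(image):
    """Return (x, y, width, height) boxes of connected black regions."""
    height, width = len(image), len(image[0])
    # Work on a copy surrounded by a one-pixel white border so that
    # neighbor lookups never run outside the array.
    work = [[0] * (width + 2)]
    work += [[0] + list(row) + [0] for row in image]
    work += [[0] * (width + 2)]
    boxes = []
    for y in range(1, height + 1):        # scanline counter
        for x in range(1, width + 1):     # pixel counter
            if work[y][x] == 1:
                boxes.append(paint_component(work, x, y))
    return boxes


def paint_component(work, x, y):
    """Erase one connected component and return its bounding box."""
    queue = deque([(x, y)])
    work[y][x] = 0
    min_x = max_x = x
    min_y = max_y = y
    while queue:
        cx, cy = queue.popleft()
        min_x, max_x = min(min_x, cx), max(max_x, cx)
        min_y, max_y = min(min_y, cy), max(max_y, cy)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = cx + dx, cy + dy
                if work[ny][nx] == 1:
                    work[ny][nx] = 0          # set to white once queued
                    queue.append((nx, ny))
    # Convert back to coordinates of the unpadded image.
    return (min_x - 1, min_y - 1, max_x - min_x + 1, max_y - min_y + 1)
```

Because each black pixel is whitened as soon as it is queued, every pixel is visited at most once, and the outer scan never reports the same component twice.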

Thus, as shown in FIG. 22, for the word "practitioner" the 
extremities of the connected character image define the 
bounding box. Once bounding boxes have been established, 
it is then possible at this step, to eliminate noise marks from 
further consideration. A mark is treated as noise: 1) if a
bounding box corner is outside the array of image pixels
(Appendix, page 171); 2) if a box spans multiple text lines
in the array (Appendix, page 229), or lies completely outside a text
line; or 3) if a box is too small compared to a reference ε, in
either or both longitudinal or latitudinal directions. Such
boxes are discarded. Noise marks 70a and 72 and
others will not be considered words. The OnABaseline()
function (Appendix, page 229) is an example of a function 
used to eliminate those boxes lying outside of the baseline 
boundaries. 
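
The three noise tests listed above can be expressed compactly. The following Python sketch is an assumption-laden illustration rather than the Appendix code: `is_noise`, the (top, bottom) representation of text lines and the single `min_size` threshold standing in for the reference size are all hypothetical.

```python
def is_noise(box, image_size, text_lines, min_size):
    """Return True if a bounding box should be discarded as noise.

    box        -- (x, y, width, height) in image coordinates
    image_size -- (width, height) of the pixel array
    text_lines -- list of (top, bottom) row ranges, bottom exclusive
    min_size   -- smallest acceptable box width and height
    """
    x, y, w, h = box
    img_w, img_h = image_size
    # 1) A corner lies outside the array of image pixels.
    if x < 0 or y < 0 or x + w > img_w or y + h > img_h:
        return True
    # 2) The box touches no text line, or more than one.
    touched = [line for line in text_lines if y < line[1] and y + h > line[0]]
    if len(touched) != 1:
        return True
    # 3) The box is too small in either direction.
    return w < min_size or h < min_size
```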

With reference to FIG. 7, at word shape computer 724, a 
signal representing the image of a word, or at least a portion 
thereof, now isolated from its neighbors, is derived. The 
derived signal is referred to as a word shape contour. The 
shape contour for each word is determined using the 
MakeShell() function (Appendix, page 228). As illustrated
in FIG. 23A, this function first moves along the top of each 
bounding box, and starting with each pixel location along 
the top of the box, scans downward relative to the page
orientation, until either a black pixel, or the bottom of the
box, is reached. A record of the set of distances d between 
the top of the box and the black pixel or box bottom is 
maintained. The set of distances d, accumulated over the 
length of the box, constitutes the top raw contour of the word
shape. Subsequently, a bottom raw contour is produced in a 
similar manner as illustrated in FIG. 23B, for the same word 
depicted in FIG. 23A, by sequentially moving across the 
bottom of the box, and looking in an upwards direction, for 
either the first black pixel or the top of the bounding box. 
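
A minimal Python sketch of the raw contour scan follows. It assumes the word has already been cropped to its bounding box and is stored as a list of rows with 1 for black and 0 for white; the function name `raw_contours` is illustrative and is not the MakeShell() code of the Appendix.

```python
def raw_contours(box):
    """Return (top, bottom) raw contours for a cropped word image.

    top[x]    -- distance from the top edge down to the first black pixel
    bottom[x] -- distance from the bottom edge up to the first black pixel
    A column with no black pixel yields the full box height.
    """
    height, width = len(box), len(box[0])
    top, bottom = [], []
    for x in range(width):
        column = [box[y][x] for y in range(height)]
        d_top = next((y for y in range(height) if column[y]), height)
        d_bottom = next(
            (i for i in range(height) if column[height - 1 - i]), height
        )
        top.append(d_top)
        bottom.append(d_bottom)
    return top, bottom
```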

With reference now to FIG. 25, at block 100 which 
preferably operates on the actual image as opposed to the 
filtered image, which could be used in this step, one or more 
reference lines are established through each word. In other
terms, the data representing the symbol string is augmented, 
so that it is defined over the range of the symbol string. In 
one embodiment, a blackout bar, which may have a finite
thickness or a zero thickness, is constructed through the
word, preferably having an upper limit or reference line at
approximately two-thirds of the x-height, and a lower limit
or reference line at approximately one-third of the x-height
(which was determined at the baseline determination step).
At contour calculation 102, a set of measurements is
derived for the distance d between the upper or lower edge
of the bounding box and either the word or the nearer
edge of the blackout bar, whichever is closer. The
measurements are made at the resolution of the
image. With reference to FIG. 26A, where the
measurements are illustrated pictorially, it can be seen that
the reference lines serve to allow the signal that will ulti- 
mately be derived from this step to be defined at every 
sampling position over the length of the word. In a preferred 
embodiment, the measurements of d are actually
generated from the contour data derived in accordance
with FIGS. 23A, 23B previously collected, and are adjusted 
to limit the distance d with either the upper or lower edge of 
the blackout bar as indicated. In the embodiment shown in 
FIG. 26A, measurements are made from the upper line of the
bounding box to the upper reference line of the word,
although this is not a requirement. Thus, for example, the 
measurement could alternatively be made from the reference 
line to either the upper or lower bounding line, or the 
character. FIG. 26B better shows how the set of measurements
is used to form the signal output from block 104. The
contour is represented as a set of distance measurements d',
relative to the reference line. Measurement d' is therefore 
derived from the measurements shown in FIG. 26A, which 
designate the stopping point of line d, and the known 
position of the blackout bar. Calculating the distance
relative to the reference line enables scaling of the word 
shape contours to a common x height, thereby facilitating 
any subsequent comparison of the shapes. Accordingly, the 
distances d' represent a measurement from the reference line
or blackout bar to the outer limits of the letter, and in the
absence of a letter, provide a zero measurement. These
measurements might be derived directly, but the proposed
indirect methods appear easier to implement. FIGS. 26C and
26D show that the sets of d' values can be plotted on a graph
to form a one-dimensional signal or curve representing the
word shape. Details of the contour determination are contained
in the function StoreOutlinePair() beginning in the
Appendix at page 255. FIG. 24 is an image of the contour
locations as established for the text sample of FIG. 10. It is
important to note the informational content of FIG. 24,
where, for the most part, it is relatively easy to recognize the 
words within the passage by their contours alone. 
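
The indirect derivation of d' from the raw distances d can be illustrated with a short Python sketch. It assumes the raw top contour is measured downward from the top of the bounding box and that `reference_offset` is the distance from the top of the box to the upper edge of the blackout bar; both names are hypothetical.

```python
def contour_relative_to_reference(raw_distances, reference_offset):
    """Convert raw distances d into distances d' from the reference line.

    Each d is first limited to the reference line, so a column in which
    no letter rises above the line gives d' = 0.
    """
    return [
        reference_offset - min(d, reference_offset) for d in raw_distances
    ]
```

For example, with the reference line 4 pixels below the top of the box, raw distances of 1, 0, 3 and 7 become d' values of 3, 4, 1 and 0.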



In studies of the information delivered by the appearance 
of English language words, it has been determined that in a 
majority of cases, words can be identified by viewing only 
approximately the top third of the image of the word. In 
other words, the upper portion of the word carries with it 
much of the information needed for identification thereof. In 
a significant portion of the remainder of cases, words that are 
unidentifiable by only the upper third of the image of the 
word, become identifiable when the identification effort 
includes the information carried by the lower third of the 
image of the word. A relatively small class of words requires 
information about the middle third of the word before 
identification can be made. It can thus be seen that a stepwise 
process might be used, which first will derive the upper word 
shape signal or contour, second will derive the lower word 
shape signal or contour, and thirdly derive a word shape 
signal central contour (from the reference line towards the 
word or bounding box), in a prioritized examination of word 
shape, as required. In the examples of FIGS. 26A, 26B, and
26C, the word "from" is fairly uniquely identifiable from its
upper portion only. In the examples of FIGS. 27A, 27B, 27C
and 27D, the word "red" is less uniquely identifiable from its
upper portion, since it may be easily confused with the word
"rod", and perhaps the word "rad". While the lower portion
of the letter "a" may distinguish "red" and "rad", it is
doubtful that the lower portion of the letter "o" will distinguish
the word "red" from "rod". However, the central
portions of "red", "rad", and "rod" are quite distinct.

The determined morphological image characteristic(s) or 
derived image unit shape representations of each selected 
image unit are compared, as noted above (step 42), either 
with the determined morphological image characteristic(s) 
or derived image unit shape representations of the other 
selected image units (step 42A), or with predetermined/user- 
selected image characteristics to locate specific types of 
image units (step 42B). The determined morphological
image characteristics of the selected image units are advan- 
tageously compared with each other for the purpose of 
identifying equivalence classes of image units such that each 
equivalence class contains most or all of the instances of a 
given image unit in the document, and the relative frequen- 
cies with which image units occur in a document can be 
determined, as is set forth more fully in the copending U.S. 
patent application Ser. No. 07/795,173 filed concurrently 
herewith by Cass et al., and entitled "Method and Apparatus 
for Determining the Frequency of Words in a Document 
without Document Image Decoding." Image units can then 
be classified or identified as significant according to the
frequency of their occurrence, as well as other characteristics
of the image units, such as their length. For example, it has 
been recognized that a useful combination of selection 
criteria for business communications written in English is to 
select the medium frequency word units. 

The method for determining the frequency of words 
without decoding the document is shown in FIG. 8. The 
image is segmented into undecoded information containing 
image units (step 820) by using the method described above 
or by finding word boxes. Word boxes are found by closing 
the image with a horizontal SE that joins characters but not 
words, followed by an operation that labels the bounding 
boxes of the connected image components (which in this 
case are words). The process can be greatly accelerated by 
using 1 or more threshold reductions (with threshold value 
1), that have the effect both of reducing the image and of 
closing the spacing between the characters. The threshold 
reduction(s) are typically followed by a closing with a small 
horizontal SE. The connected component labeling operation
is also done at the reduced scale, and the results are scaled
up to full size. The disadvantage of operating at reduced 
scale is that the word bounding boxes are only approximate; 
however, for many applications the accuracy is sufficient. 
The described method works fairly well for arbitrary text
fonts, but in extreme cases, such as large fixed-width fonts
that have large inter-character separation or small variable-width
fonts that have small inter-word separation, mistakes
can occur. The most robust method chooses an SE for closing
based on a measurement of specific image characteristics.
This requires adding the following two steps: 

(1) Order the image components in the original or reduced 
(but not closed) image in line order, left to right and top 
to bottom. 

(2) Build a histogram of the horizontal inter-component
spacing. This histogram should naturally divide into the 
small inter-character spacing and the larger inter-word 
spacings. Then use the valley between these peaks to 
determine the size of SE to use for closing the image to 
merge characters but not join words.
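
Step (2) can be illustrated with a small Python sketch. It approximates the valley between the two histogram peaks as the widest stretch of unoccupied bins and returns a width near its middle; the function name and this particular approximation are assumptions, not the method of the Appendix.

```python
from collections import Counter


def choose_closing_width(gaps):
    """Pick a horizontal SE width from inter-component gaps (pixels)."""
    histogram = Counter(gaps)
    occupied = sorted(histogram)      # gap sizes that occur at least once
    if len(occupied) < 2:
        raise ValueError("need at least two distinct gap sizes")
    # The widest run of empty bins is taken as the valley separating
    # inter-character gaps from inter-word gaps.
    low, high = max(
        zip(occupied, occupied[1:]), key=lambda pair: pair[1] - pair[0]
    )
    return (low + high + 1) // 2
```

With character gaps of 2 to 3 pixels and word gaps of 12 to 14 pixels, the sketch returns a width of 8, which lies between the two clusters.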

After the bounding boxes or word boxes have been
determined, locations of and spatial relationships between
the image units on a page are determined (step 825). For 
example, an English language document image can be 
segmented into word image units based on the relative
difference in spacing between characters within a word and
the spacing between words. Sentence and paragraph boundaries
can be similarly ascertained. Additional region segmentation
image analysis can be performed to generate a
physical document structure description that divides page
images into labelled regions corresponding to auxiliary 
document elements like figures, tables, footnotes and the 
like. Figure regions can be distinguished from text regions 
based on the relative lack of image units arranged in a line 
within the region, for example. Using this segmentation,
knowledge of how the documents being processed are
arranged (e.g., left-to-right, top-to-bottom), and, optionally,
other inputted information such as document style, a "reading
order" sequence for word images can also be generated.
The term "image unit" is thus used herein to denote an
identifiable segment of an image such as a number, charac- 
ter, glyph, symbol, word, phrase or other unit that can be 
reliably extracted. 

Advantageously, for purposes of document review and 
evaluation, the document image is segmented into sets of
signs, symbols or other elements, such as words, which
together form a single unit of understanding. Such single
units of understanding are generally characterized in an
image as being separated by a spacing greater than that
which separates the elements forming a unit, or by some
predetermined graphical emphasis, such as, for example, a
surrounding box image or other graphical separator, which
distinguishes one or more image units from other image
units in the document image. Such image units representing
single units of understanding will be referred to hereinafter
as "word units." 

A discrimination step 830 is next performed to identify
the image units which have insufficient information content
to be useful in evaluating the subject matter content of the
document being processed by using the technique described
above. 

Next, in step 840, selected image units, e.g., the image
units not discriminated in step 830, are evaluated, without
decoding the image units being classified or reference to
decoded image data, based on an evaluation of predetermined
image characteristics of the image units. The evaluation
entails a determination (step 841) of the image characteristics
and a comparison (step 842) of the determined
image characteristics for each image unit with the deter- 
mined image characteristics of the other image units. 

One preferred method for defining the image unit mor- 
phological image characteristics to be evaluated is to use the 
word shape derivation techniques previously discussed. At 
least one, one-dimensional signal characterizing the shape of 
a word unit is derived; or an image function is derived 
defining a boundary enclosing the word unit, and the image 
function is augmented so that an edge function representing 
edges of the character string detected within the boundary is 
defined over its entire domain by a single independent 
variable within the closed boundary, without individually 
detecting and/or identifying the character or characters mak- 
ing up the word unit. 

The determined image characteristic(s), e.g., the derived
image unit shape representations of each selected image unit 
are compared, as noted above (step 841), with the deter- 
mined image characteristic(s)/derived image unit shape rep- 
resentations of the other selected image units for the purpose 
of identifying equivalence classes of image units (step 850), 
such that each equivalence class contains most or all of the 
instances of a given word in the document. The equivalence
classes are thus formed by clustering the image units in the 
document based on the similarity of image unit classifiers, 
without actually decoding the contents of the image units, 
such as by conversion of the word images to character codes 
or other higher-level interpretation. Any of a number of 
different methods of comparison can be used. One technique 
that can be used, for example, is by correlating the raster 
images of the extracted image units using decision networks, 
such technique being described for characters in a Research 
Report entitled "Unsupervised Construction of Decision 
networks for Pattern Classification" by Casey et al., IBM 
Research Report, 1984, herein incorporated in its entirety. 

Depending on the particular application, and the relative 
importance of processing speed versus accuracy, for 
example, comparisons of different degrees of precision can 
be performed. For example, useful comparisons can be 
based on length, width or some other measurement dimension
of the image unit (or derived image unit shape representation,
e.g., the largest figure in a document image); the
location of the image unit in the document (including any 
selected figure or paragraph of a document image, e.g., 
headings, initial figures, one or more paragraphs or figures), 
font, typeface, cross-section (a cross-section being a 
sequence of pixels of similar state in an image unit); the 
number of ascenders; the number of descenders; the average 
pixel density; the length of a top line contour, including 
peaks and troughs; the length of a base contour, including 
peaks and troughs; and combinations of such classifiers. 

One way in which the image units can be conveniently 
compared and classified into equivalence classes is by 
comparing each image unit or image unit shape representa- 
tion when it is formed with previously processed image 
units/shape representations, and if a match is obtained, the 
associated image unit is identified with the matching equiva- 
lence class. This can be done, for example, by providing a 
signal indicating a match and incrementing a counter or a 
register associated with the matching equivalence class. If
the present image unit does not match with any previously 
processed image unit, then a new equivalence class is 
created for the present image unit.

Alternatively, as shown (step 50) the image units in each 
equivalence class can be linked together, and mapped to an 
equivalence class label that is determined for each equiva- 
lence class. The number of entries for each equivalence class 
can then be merely counted.

Thus, after the entire document image, or a portion of
interest, has been processed, a number of equivalence
classes will have been identified, each having an associated
number indicating the number of times an image unit was
identified having similar morphological characteristics, or
classifiers, thus determining the image unit frequency. 
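
The incremental formation of equivalence classes and their counts can be sketched in Python as follows. The sketch assumes each image unit has already been reduced to a shape representation (here a list of numbers) and that a caller-supplied `is_match` predicate decides whether two representations match; both the function names and the example threshold are hypothetical.

```python
def count_equivalence_classes(shapes, is_match):
    """Group shapes into classes; return a list of [representative, count]."""
    classes = []
    for shape in shapes:
        for entry in classes:
            if is_match(entry[0], shape):
                entry[1] += 1             # increment the class counter
                break
        else:
            classes.append([shape, 1])    # no match: start a new class
    return classes


def close_enough(a, b, threshold=2):
    """Example predicate: equal length and small sum of squared differences."""
    if len(a) != len(b):
        return False
    return sum((p - q) ** 2 for p, q in zip(a, b)) <= threshold
```

Each unit is compared only against one representative per class, so the result depends on processing order; a production system might instead compare against a class average.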

It will also be appreciated that the selection process can be 
extended to phrases comprising identified significant image 
units and adjacent image units linked together in reading 
order sequence. The frequency of occurrence of such
phrases can also be determined such that the portions of the 
source document which are selected for summarization 
correspond with phrases exceeding a predetermined fre- 
quency threshold, e.g., five occurrences. A preferred method 
for determining phrase frequency through image analysis
without document decoding is disclosed in copending U.S.
patent application Ser. No. 07/774,555 filed concurrently
herewith by Withgott et al., and entitled "Method and
Apparatus for Determining the Frequency of Phrases in a
Document Without Document Image Decoding."

It will be appreciated that the specification of the image 
characteristics for titles, headings, captions, linguistic crite- 
ria or other significance indicating features of a document 
image can be predetermined and selected by the user to 
determine the selection criteria defining a "significant"
image unit. For example, titles are usually set off above
names or paragraphs in boldface or italic typeface, or are in 
larger font than the main text. A related convention for titles 
is the use of a special location on the page for information 
such as the main title or headers. Comparing the image
characteristics of the selected image units of the document 
image for matches with the image characteristics associated 
with the selection criteria, or otherwise recognizing those 
image units having the specified image characteristics, permits
the significant image units to be readily identified
without any document decoding. 

Any of a number of different methods of comparison can 
be used. One technique that can be used, for example, is by 
correlating the raster images of the extracted image units 
using decision networks, such technique being described in
a Research Report entitled "Unsupervised Construction of 
Decision networks for Pattern Classification" by Casey et 
al., IBM Research Report, 1984, herein incorporated in its 
entirety. 

Preferred techniques that can be used to identify equivalence
classes of word units are the word shape comparison
techniques disclosed in U.S. patent application Ser. Nos.
07/796,119 and 07/795,169, filed concurrently herewith by
Huttenlocher and Hopcroft, and by Huttenlocher, Hopcroft
and Wayner, respectively, and entitled, respectively, "Optical
Word Recognition By Examination of Word Shape,"
Published European Application No. 0543592, published 
May 26, 1993, and "Method for Comparing Word Shapes." 

For example, U.S. patent application Ser. No. 07/795,169 
discloses, with reference to FIG. 7, one manner in which a
comparison is performed at word shape comparator 726. In 
one embodiment, the comparison is actually several small 
steps, each of which will be described. With reference to 
FIG. 28, generally, the two word shape signals, one for a known
word, the other for an unknown string of characters, are
compared to find out whether they are similar. However, in
this case, signal R is the upper contour of the word "red",
while signal F is the upper contour of the word "from".
Actually, relatively few signals could be expected to be
exactly identical, given typical distinctions between character
fonts, reproduction methods, and scanned image quality.
However, the word shape signals to be compared may be
scaled with respect to one another, so that they have the
same x-heights. This is achieved by determining the x-height
of the pair of word shape contours to be compared. Once 
determined, the ratios of the x-heights are used to determine 
a scale factor to be applied to one of the contours. As the 
x-height is a characteristic measurement for fonts, it is used 
to determine the scaling factor in both the horizontal and 
vertical directions. An example of the scaling operation is 
found in the fontNorm.c file beginning at line 172, where the
StoreOutlinePair() function carries out the scaling operation
in both the x and y, horizontal and vertical, directions. 
Alternatively, the shape signals may be compared without 
normalization and a weighting factor imposed upon the
portion of the measured difference due to the unequal 
lengths. Furthermore, the amplitude or height of the signals 
has been normalized to further reduce the impact of the font 
size on the word shape comparison. 
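
The x-height scaling can be illustrated with the following Python sketch. It applies the ratio of the x-heights as a single scale factor in both directions, using nearest-neighbor resampling for the horizontal direction; the function name and the choice of resampling are assumptions and do not reproduce the fontNorm.c code.

```python
def scale_contour(contour, x_height, target_x_height):
    """Scale a contour horizontally and vertically by the x-height ratio."""
    factor = target_x_height / x_height
    new_length = max(1, round(len(contour) * factor))
    scaled = []
    for i in range(new_length):
        # Nearest-neighbor lookup of the source sample for output sample i.
        src = min(len(contour) - 1, int(i / factor))
        scaled.append(contour[src] * factor)   # vertical scaling
    return scaled
```

Scaling the contour [2, 4, 6] from an x-height of 10 to an x-height of 20 doubles both its length and its amplitudes, giving [4, 4, 8, 8, 12, 12].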

Referring next to FIGS. 29A-29C, which illustrate details 
of the ascender/descender normalization operation, each of 
the shape signals is normalized based upon a common
relationship between the ascender and descender heights and 
the x-height of the text characters. As illustrated, the actual 
ascender heights of characters printed with supposedly similar
font size, or what is now an appropriately scaled font
size, may be slightly different. This occurs as a result of type
faces or fonts which are small on body or large on body,
implying that similar characters exhibit variations in height
across fonts that are the same size, for example 24 point
fonts. As an illustration, distance d1 in FIG. 29A represents
the difference in ascender height for two occurrences of the
letter "h." Likewise, distance d2 illustrates a similar difference
between the heights of the letter "f" in FIG. 29B. As
illustrated in FIG. 29C, the typical character may be broken 
into three sections, ascender portion 390, x-height portion 
392, and descender portion 394. In addition, the relative 
heights of these sections are illustrated as c, a, and b, 
respectively. Again, the normalization operation applied to 
the shape contours is found in the fontNorm.c module,
beginning at page 183 of the Appendix. Applying the
operations described with respect to the StoreOutlinePair()
function, page 255 of the Appendix, the areas of the contour 
lying above the x-height are scaled as follows:

    f(t) = \frac{1.5}{a + c} f(t)

Similarly, the descenders are scaled by the following equation:

    f(t) = \frac{1.5}{a + b} f(t)

where, in both cases, the value used in the numerator (1.5)
is arrived at based upon observation of the relationship 
between ascender or descender heights and the x-height. 
Also included within the StoreOutlinePair() function is an
operation to remove the portions of the contours which do 
not represent portions of the text string. These regions lie at 
the ends of the bounding boxes illustrated in FIG. 22. For 
example, the box surrounding the word "practitioner" in
FIG. 22 can be seen to extend beyond the actual word image. 
As further illustrated at the ends of the word "from" in FIGS. 
26A-26D, the contour does not contain useful information. 
By removing these regions from the contour shape, less error 
will be introduced into the comparison operations. 

Subsequent to the normalization operation, standard sig- 
nal processing steps can be used to determine the similarity 
or dissimilarity of the two signals being compared. Alternatively,
the following equation may be used:

    \Delta_{string} = \sqrt{\int_0^1 \left( f(x) - g'(x) \right)^2 dx}

where

\Delta_{string} is the difference value between the two signals;

f(x) is the known signal; and

g'(x) is the unknown signal.
In a simple determination, the difference could be examined
and if it is close to zero, this would indicate that there
is almost no difference between the two signals.
However, the greater the amount of difference, the more
likely that the word was not the same as the word to which
it was being compared.

It is important to note that the embodiments described 
herein, as supported by the code listings of the Appendix, 
compare the word shape contours using the upper and lower 
contours for each word in conjunction with one another. This 
is an implementation-specific decision, and is not intended
to limit the invention to comparisons using only the top and 
bottom contours in conjunction with one another. In fact, 
sufficient information may be contained within the upper 
contours alone so as to significantly reduce the requirements 
for a comparison of the lower contours, thereby saving
considerable processing effort. 

The steps of this simplified comparison method, as first 
contemplated, are illustrated in FIG. 30. Beginning at step
410, the contour for the first word shape is retrieved from
memory, and subsequently, the second word shape is
retrieved by step 412. Next, the centers of gravity of the
word shapes, defined by the upper and lower contours, are 
determined and aligned, step 414. The purpose of this step 
is to align the centers of the word contours to reduce the 
contour differences that would be attributable solely to any
relative shift between the two sets of contours being compared.
The center of gravity is determined by summing the
areas under the curves (mass) and the distances between the
contours (moments), which are then divided to give an
indication of the center of gravity for the upper and lower
contour pair. Once determined for both sets of contour pairs, 
the relative shift between the pairs is determined, step 416, 
and the contours are shifted prior to calculating the differ- 
ence between the contours. The shifting of the contours is 
necessary to reduce any error associated with the establishment
of the word shape boundaries and computation of the
word shapes at block 724 of FIG. 7. Step 418 handles those 
regions lying outside the overlapping range of the shifted 
contour pairs, determining the difference against a zero 
amplitude signal in the non-overlapping regions. This is
done by summing the squared values of the upper and lower
contours at the non-overlapping ends of the contours.
Subsequently, the overlapping regions of the contours are
compared, step 420. The difference in this region is determined
as the sum of the squared differences between the upper
curves and the lower curves, as shown in the function
L2Norm() on page 100 of the Appendix. Next, the values
returned from steps 418 and 420 are added to determine a
sum of the differences over the complete range defined by
the shifted contours. This value may then be used as a
relative indication of the similarity between the contour 
pairs for the two word shapes being compared. 
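
The steps of FIG. 30 can be sketched in Python as follows. This is an illustration under stated assumptions, not the Appendix code: the center of gravity is taken as the first moment of the summed upper and lower contours divided by their total mass, the shift is rounded to a whole sample, and all function names are hypothetical.

```python
def center_of_gravity(upper, lower):
    """Horizontal center of gravity of an upper/lower contour pair."""
    mass = sum(u + l for u, l in zip(upper, lower))
    moment = sum(x * (u + l) for x, (u, l) in enumerate(zip(upper, lower)))
    return moment / mass if mass else 0.0


def value_at(contour, index):
    """Contour sample, or zero amplitude outside the contour's range."""
    return contour[index] if 0 <= index < len(contour) else 0


def contour_difference(upper_a, lower_a, upper_b, lower_b):
    """Sum of squared differences after aligning centers of gravity."""
    shift = round(
        center_of_gravity(upper_a, lower_a)
        - center_of_gravity(upper_b, lower_b)
    )
    # Sample j of word B is aligned with sample j + shift of word A.
    start = min(0, shift)
    stop = max(len(upper_a), len(upper_b) + shift)
    total = 0.0
    for x in range(start, stop):
        du = value_at(upper_a, x) - value_at(upper_b, x - shift)
        dl = value_at(lower_a, x) - value_at(lower_b, x - shift)
        total += du ** 2 + dl ** 2
    return total
```

Because samples outside a contour's range are treated as zero, the loop covers both the non-overlapping ends (step 418) and the overlapping region (step 420) in a single pass. A smaller result indicates a closer match.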

An alternative to the center-of-gravity comparison
method uses a signal processing function known as time
warping, as described in the article "Performance Tradeoffs
in Dynamic Time Warping Algorithms for Isolated Word
Recognition", by Myers, Rabiner, and Rosenberg, IEEE
Transactions on Acoustics, Speech, and Signal Processing,
Vol. ASSP-28, No. 6, December 1980, and the book, "Time
Warps, String Edits, and Macromolecules: The Theory and
Practice of Sequence Comparison", by Sankoff and Kruskal,
Addison-Wesley Publishing Company, Inc., Reading, Mass.,
1983, Chapters 1 and 4, and may be used to provide for
compression and expansion of points along the contours 
until the best match is made. Then a score is derived based 
on the amount of difference between the contours being 
compared and the stretching required to make the contours 
match. Once again, the score provides a relative indication 
of the match between the two signals being compared. 

Referring now to FIG. 31, which depicts the general steps 
of the dynamic warping method, the method relies on the use 
of a difference array or matrix to record the distances 
between each point of the first contour and points of the 
contour to which it is being compared. As illustrated in the 
figure, and detailed in the code listings contained in the 
Appendix, the process is similar for all of the measures 
which may be applied in the comparison. 

First, the organization of the code is such that a data 
structure is used to dynamically control the operation of the 
various comparison functions. The structure DiffDescriptor, 
the declaration for which is found on page 9 of the Appendix 
(see diff.h), contains variables which define the measure to 
be applied to the contours, as well as other factors that will
be used to control the comparison. These factors include:
normalization of the contour lengths before comparison;
separate comparisons for the upper and lower contours; a
centerWeight factor to direct the warping path; a bandwidth
to constrain the warp path; a topToBottom ratio which
enables the top contour comparison to be weighted more or
less with respect to the bottom contour comparison; and a
hillToValley ratio to selectively control weighting the con-
tour differences when an unknown contour is being com-
pared to a known or model word shape contour. Interpreta- 
tion of the various factors is actually completed in the diff2.c 
module at page 56 of the Appendix, although descMain.c at 
page 49 provides an illustration of the interpretation of the 
factors. 
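
For orientation, a control structure of this kind might be sketched in Python as shown below. The field names, types, and default values are assumptions chosen to mirror the factors listed above; the actual DiffDescriptor declaration in diff.h is not reproduced here and may differ.

```python
from dataclasses import dataclass


@dataclass
class DiffDescriptor:
    """Hypothetical stand-in for the descriptor that steers a comparison."""
    measure: str = "warp"            # assumed labels: "slope_warp", "warp", "plain"
    normalize_lengths: bool = False  # normalize contour lengths before comparing
    separate: bool = False           # warp top and bottom contours independently
    center_weight: float = 1.0       # weight applied along the array diagonal
    bandwidth: float = 0.0           # width of the band constraining the warp path
    top_to_bottom: float = 1.0       # weight of top contour relative to bottom
    hill_to_valley: float = 1.0      # asymmetric weight for model vs. unknown


# Example: a slope-limited warp that favors the top contour.
descriptor = DiffDescriptor(measure="slope_warp", center_weight=0.5,
                            top_to_bottom=2.0)
```

Collecting the factors in one record allows a single dispatch routine to select the measure and pass the remaining values through unchanged.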

In general, each measure implements a comparison tech-
nique; however, each is optimized for a specific type of
dynamic comparison, for example, a slope limited dynamic
warp having a non-unitary centerWeight and a topToBottom
weight greater than one. The first level of selection enables 
the use of a slope-constrained warping function for com- 
parison, an unconstrained warp, or a simple, non-warped, 
comparison. Within both of the warp comparison methods, 
there are both separate comparison functions, where the top 
and bottom contours are warped independently, and parallel 
comparison functions, where the warp is applied to both the 
top and bottom contours simultaneously. Specific details of 
the comparison functions are generally contained within the 
newMatch.c file beginning at page 101 of the Appendix.

In the general embodiment, the dynamic warping process 
starts by allocating space for the path/distance array, step 
450, which will hold the distance values generated during 
the comparison and warping of one word shape contour with 
respect to another. After allocating space, the border regions 
of the array must be initialized as the process used by all the 
warping measures is an iterative process using data previ- 
ously stored in the array for the determination of the 
cumulative difference between the contours. At step 452, the 
array borders are initialized. Initialization of the first row of 
the array entails the determination of the square of the 
difference between a first point on the first contour and each 
point on the second contour. Subsequent to border initial-
ization, the column and row index values, L1 and L2,
respectively, are reset to 1 to begin processing the individual, 
non-border, points along the contours. 

Processing of the contours proceeds at steps 458 through
464, where the difference in distance between each point
along the second contour, with respect to a point on the first
contour, is calculated. Moreover, this difference, or distance,
is calculated and then summed with a previously determined
difference value. In addition, some of the previously deter-
mined difference values may be weighted differently; for
example, in one embodiment weights of the difference
values along the array diagonal may be modified by a
centerWeight weighting factor. As an illustration, in the
NewMatch() function, beginning at line 106 on
page 103, the distance (rest) is first calculated as the sum
of the squares of the differences between a point on the first
contour and a point on the second contour, over the upper
and lower contours, where the top contour difference is
weighted by the topToBottom variable. This distance (rest)
is used in subsequent iterations to determine the horizontal,
vertical and diagonal difference values in the loop beginning
at line 137 on page 103. To determine each of these values,
the current distance value, represented by rest, would be
added to the previous values in the down, left, and down-left
array positions, the down-left position value being the
diagonal position which is weighted by the centerWeight
factor as previously described. FIG. 32A
illustrates the positional relationship between a previously
determined value X, at array location 502, and subsequent
array locations; the value X might be added to the difference
values of subsequent locations to accumulate the total dif-
ference. When calculating the differ-
ence value for array location 504, the value in location 502
would be used as the down value. Similarly, when calculat-
ing the value in location 506, the value of location 502
would be used as the center-weighted down-left, or diago-
nal, value. After calculating the three difference values, steps
458, 460, and 462, the process continues by selecting the
smallest of the three values, step 464, for insertion into the
current array position, step 466. As illustrated in the Appen-
dix at line 144 of page 103, the FMin() function from page
101 returns the minimum of the three values previously
calculated, the value being inserted into the storage array
pointed to by pointer dc.

Subsequently, the process illustrated in FIG. 31 continues
by determining the differences between the point on the first
contour, represented by L1, to points on the second contour,
represented by L2. Decision step 468 controls the iterative
processing of the points along the second contour by testing
for the end of the contour, or swath. In the implementation
shown in the Appendix, the index variables i and j are used
in place of L1 and L2 to control the difference calculation
loops. As indicated in the code for the NewMatch function
beginning on page 102 of the Appendix, the swath is referred
to as the bandwidth, and is determined by a desired band-
width which is adjusted for the slope defined by the contour
lengths (see page 102, lines 83-89). If no limit has been
reached, processing for the next point would continue at step
458 after the value of L2 was incremented at step 470.
Similarly, decision step 472 controls the processing of each
point along the first contour, in conjunction with increment-
ing step 474. Once all the points have been processed with
respect to one another, as evidenced by an affirmative
response in step 472, the relative difference score, best score,
is contained in the farthest diagonal position of the array
(L1, L2). Subsequently, the value determined at step 476 is
returned as an indication of the dynamically warped differ- 
ence between the contours being compared. 
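
The steps of FIG. 31 might be sketched in Python as follows. This is a minimal illustration and not the NewMatch() listing: the function and parameter names are hypothetical, the border cells are assumed to accumulate along the array edges in the textbook manner, and the centerWeight factor is assumed to scale the current distance on the diagonal step.

```python
def warp_difference(top1, bot1, top2, bot2,
                    center_weight=1.0, top_to_bottom=1.0):
    """Cumulative warped difference between two contour pairs."""
    n, m = len(top1), len(top2)

    def rest(i, j):
        # Squared difference over upper and lower contours at (i, j),
        # with the top term weighted by top_to_bottom.
        return (top_to_bottom * (top1[i] - top2[j]) ** 2
                + (bot1[i] - bot2[j]) ** 2)

    # Allocate the path/distance array (step 450).
    d = [[0.0] * m for _ in range(n)]
    # Initialize the borders (step 452).
    d[0][0] = rest(0, 0)
    for j in range(1, m):
        d[0][j] = d[0][j - 1] + rest(0, j)
    for i in range(1, n):
        d[i][0] = d[i - 1][0] + rest(i, 0)
    # Fill the interior, keeping the smallest of three candidates.
    for i in range(1, n):
        for j in range(1, m):
            r = rest(i, j)
            d[i][j] = min(d[i - 1][j] + r,                       # down
                          d[i][j - 1] + r,                       # left
                          d[i - 1][j - 1] + center_weight * r)   # diagonal
    # The score sits in the farthest diagonal position.
    return d[n - 1][m - 1]
```

In this sketch a contour that merely repeats a sample of the other contour incurs no additional difference, which is the stretching behavior the warp is intended to permit.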



The code implementation found in the NewMatch() func-
tion on page 103 of the Appendix has optimized the execu-
tion of the aforedescribed warping process by reducing the 
large two-dimensional array to a pair of linear arrays which 
are updated as necessary. Due to this modification, the 
minimum difference, or best score, for the warp comparison 
value is found in the last location of the one-dimensional 
array. Furthermore, the final difference value, dc, may be 
subsequently normalized to account for the length differ- 
ences between the two sets of contours being compared. 
Finally, such a value might subsequently be compared 
against a threshold or a set of similarly obtained difference 
values to determine whether the contours are close enough
to declare a match between the words, or to determine the 
best match from a series of word shape comparisons. 
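
The reduction to a pair of linear arrays might look like the following Python sketch, shown for a single contour for brevity. The names are hypothetical, and dividing by the sum of the contour lengths is only one assumed form of length normalization; the Appendix may normalize differently.

```python
def rolling_warp(a, b, center_weight=1.0):
    """Warped difference using two one-dimensional arrays."""
    # First row: cumulative squared differences against a[0].
    prev, acc = [], 0.0
    for bj in b:
        acc += (a[0] - bj) ** 2
        prev.append(acc)
    for i in range(1, len(a)):
        # Only the previous row and the row being built are kept.
        cur = [prev[0] + (a[i] - b[0]) ** 2]
        for j in range(1, len(b)):
            r = (a[i] - b[j]) ** 2
            cur.append(min(prev[j] + r,
                           cur[j - 1] + r,
                           prev[j - 1] + center_weight * r))
        prev = cur
    # The best score is in the last location of the linear array.
    return prev[-1]


def normalized_warp(a, b):
    # Assumed normalization: divide by the combined contour length.
    return rolling_warp(a, b) / (len(a) + len(b))
```

Memory use then grows with the length of one contour rather than with the product of both lengths.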

In yet another embodiment, the dynamic time warping 
process previously described may be altered to compare the 
difference values contained in the difference array to a 
threshold value on a periodic basis. Upon comparison, the 
process may be discontinued when it is determined that 
sufficient difference exists to determine that the contours 
being compared do not match one another, possibly saving 
valuable processing time. Moreover, the sequential opera- 
tion of word shape comparator 726 might be done in 
conjunction with sequential output from word shape com-
puter 724, thereby enabling the parallel processing of a 
textual image when searching for a keyword. 
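
One way such a periodic threshold test could be arranged is sketched below in Python. The function name and the choice to test once per row are assumptions. Because every squared difference is non-negative, the smallest value in a completed row is a lower bound on the final score, so the comparison can be abandoned as soon as that minimum exceeds the threshold.

```python
def warp_with_cutoff(a, b, threshold):
    """Return the warped difference, or None if abandoned early."""
    row, acc = [], 0.0
    for bj in b:
        acc += (a[0] - bj) ** 2
        row.append(acc)
    if min(row) > threshold:
        return None
    for i in range(1, len(a)):
        cur = [row[0] + (a[i] - b[0]) ** 2]
        for j in range(1, len(b)):
            r = (a[i] - b[j]) ** 2
            cur.append(r + min(row[j], cur[j - 1], row[j - 1]))
        # Every warp path crosses this row, so its minimum bounds the result.
        if min(cur) > threshold:
            return None
        row = cur
    return row[-1]
```

A returned value of None indicates that the contours were judged not to match before the full array was processed.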

Having described a basic implementation of the dynamic 
warping comparison measures, the distinctions of the other 
dynamic warp comparison methods included in the Appen- 
dix and the application of the control factors previously 
mentioned will be briefly described to illustrate the numer- 
ous possible embodiments of the present invention. First, the 
method previously described may also be implemented with 
the slope of the warp path being constrained as it moves 
across the array. Details of the implementation are found in 
the SlopeCMatch() function beginning on page 111 of the
Appendix. This measure is further illustrated graphically in
FIG. 32B, where the value of array location 512, X, may be
added to only the three subsequent array locations shown.
For example, X may be added to array location 514, when
considered as the d2L1 value for location 514. The nomen-
clature used for the variable names, and followed in the
figure, is as follows: d2L1 refers to the array location which
is down 2 rows and left one column, d1L1 refers to the
lower left diagonal array location, and d1L2 refers to the
array location that is down one row and left 2 columns from
the current array location. In a similar manner, X may be
added as the d1L2 value for the calculation of the cumulative
difference value for array location 516. 
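
A simplified Python sketch of a slope-constrained recurrence using the d2L1, d1L1 and d1L2 predecessors is given below. It is not the SlopeCMatch() listing: the function name is hypothetical, a single contour is used, and the sample skipped by a d2L1 or d1L2 step is assumed to add no cost, which the Appendix code may handle differently.

```python
import math


def slope_constrained_warp(a, b):
    """Warped difference where each step comes from d2L1, d1L1 or d1L2."""
    n, m = len(a), len(b)
    d = [[math.inf] * m for _ in range(n)]
    d[0][0] = (a[0] - b[0]) ** 2
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            d2l1 = d[i - 2][j - 1] if i >= 2 and j >= 1 else math.inf
            d1l1 = d[i - 1][j - 1] if i >= 1 and j >= 1 else math.inf
            d1l2 = d[i - 1][j - 2] if i >= 1 and j >= 2 else math.inf
            best = min(d2l1, d1l1, d1l2)
            if best < math.inf:
                d[i][j] = best + (a[i] - b[j]) ** 2
    # Infinity means no admissible path reaches the far corner.
    return d[n - 1][m - 1]
```

Because no purely horizontal or vertical step is available, contours whose lengths differ by more than a factor of about two cannot be aligned at all under this recurrence.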

As is apparent from a comparison of FIGS. 32A and 32B, 
the slope constrained warping measure limits the warping 
path which can be followed during the generation of the 
cumulative difference value. The reason for implementing 
such a constraint is to prevent the warping process from 
removing, or compressing, a large area of one of the two 
contours being compared, without imposing a significant 
"cost" to such a compression. 

Next, the method previously described with respect to the 
parallel warping process may also be implemented on only 
one pair of contours at a time, for example, the upper 
contours of two word shapes. The functions SepMatch() and 
SepCMatch(), as found in the Appendix on pages 104 and
113, respectively, implement the separate matching measure 
in both the non-slope-constrained and slope-constrained 
fashions previously described. In general, these measures 
separately calculate the difference between the top or bottom
contours of a pair of wordshapes. The general implementa-
tion indicated for the measures in the code shows that these 
measures are typically used sequentially, first determining 
the warped difference for the top contours, and then adding 
to it the warped difference from the bottom contour com- 
parison, resulting in a total difference for the wordshapes. 

By carrying out the comparison methods described in a 
"piece-wise" cascaded fashion, further processing benefits 
may also be derived. More specifically, cascaded compari- 
son would entail, first, utilizing the upper contours of the 
words being compared to identify a word, or at least narrow 
the set of possible alternatives and, second, using the lower 
contour comparison to provide complete identification. It is 
believed that such an approach to word shape comparison 
operation 726 would considerably reduce processing time 
spent on identifying unknown word shapes by comparison to 
a dictionary of known word shapes, 728, as illustrated in 
FIG. 7. Important to the cascaded comparison is the con-
straint that the top and bottom warps applied to the contours
must be relatively equivalent. This requirement arises from 
the fact that the upper and lower curves have a relationship 
to a common word, and if this relationship is not maintained 
during the warp analysis, the accuracy of the comparison 
will be compromised. 
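
The two-stage idea might be sketched in Python as follows. For brevity the sketch uses a plain sum of squared differences on equal-length contours in place of a warped measure, and it does not enforce the equivalence of top and bottom warps discussed above. The function names, the dictionary layout, and the threshold are all assumptions made for illustration.

```python
def sq_diff(c1, c2):
    # Plain, non-warped squared difference (stand-in for a warp measure).
    return sum((p - q) ** 2 for p, q in zip(c1, c2))


def cascaded_match(unknown, dictionary, top_threshold):
    """Narrow candidates by upper contour, then decide by lower contour."""
    top, bottom = unknown
    # Stage 1: keep only words whose upper contour is close enough.
    candidates = {word: shape for word, shape in dictionary.items()
                  if sq_diff(top, shape[0]) <= top_threshold}
    if not candidates:
        return None
    # Stage 2: the lower contour settles the identification.
    return min(candidates,
               key=lambda word: sq_diff(bottom, candidates[word][1]))


known_shapes = {
    "cat": ([1, 2, 3], [0, 0, 0]),
    "cot": ([1, 2, 3], [0, -1, 0]),
    "dig": ([3, 1, 1], [0, 0, -2]),
}
result = cascaded_match(([1, 2, 3], [0, -1, 0]), known_shapes, 1.0)
```

Only the entries surviving the first stage incur the cost of the second comparison, which is where the expected saving in processing time arises.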

Alternatively, the dynamic warping technique may be 
applied as described, with the addition of a function suitable 
for accumulating the relative warp applied to the upper and 
lower curves in achieving the best match. For example, 
when a known, non-italicized word shape is compared to an 
unknown word shape, a shift in the warp applied to the upper
curve relative to the lower curve could be indicative of an
italicized word; however, the length of the warped region
will remain the same for the top and bottom warps. Such a
technique may prove useful in the identification of important 
words within a larger body of text, as these words are 
occasionally italicized for emphasis. 

One of the control factors which has not been previously 
described is the bandwidth factor. As implemented, the 
bandwidth factor controls the relative width of the signal 
band in which the warping signal will be constrained. More 
specifically, the band width limitation is implemented by 
defining a region about the array diagonal in which the warp 
path which traverses the array is constrained. The constraint 
is implemented by assigning large values to those areas 
outside of the band width, so as to make it highly unlikely 
that the path would exceed the constraint. 
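
A band constraint of this kind might be sketched in Python as follows. The names and the value used for the large constant are hypothetical, and measuring the band about the straight line joining the two corners of the array is an assumption about how the diagonal is defined for contours of unequal length.

```python
BIG = 1e30  # assumed stand-in for the "large value" outside the band


def banded_warp(a, b, bandwidth):
    """Warped difference with the path confined near the array diagonal."""
    n, m = len(a), len(b)
    slope = (m - 1) / (n - 1) if n > 1 else 0.0
    d = [[BIG] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if abs(j - i * slope) > bandwidth:
                continue  # outside the band: cell keeps the large value
            r = (a[i] - b[j]) ** 2
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(d[i - 1][j] if i > 0 else BIG,
                           d[i][j - 1] if j > 0 else BIG,
                           d[i - 1][j - 1] if i > 0 and j > 0 else BIG)
            d[i][j] = min(best + r, BIG)
    return d[n - 1][m - 1]
```

For the contours [0, 0, 0, 1] and [0, 1, 1, 1], a bandwidth of 2 admits a zero-difference path, whereas a bandwidth of 1 forces the path through a mismatched cell.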

Another factor which was briefly mentioned is the
topToBottom factor. When applied, the value of this variable is
used to weight the difference value determined for the top
contour warping process. Therefore, use of a number greater
than one will cause the upper contour difference to be
weighted more heavily than the lower contour difference. A 
very large number would effectively eliminate the lower 
contour difference completely and, likewise, a zero value 
would eliminate the upper contour difference completely. 
This factor is generally considered important to enable the 
upper contour to be weighted in proportion to its information 
content, as it generally carries more information regarding 
the word than does the lower contour. 

The hillToValley ratio is a variable which is usually 
applied in situations when a known, or model, set of word 
shape contours is being compared against a set of word 
shape contours from an unknown image. In exercising this 
option, the model set of contours is passed to the comparison
measure functions, for example, NewMatch() on page 102
of the Appendix. When determining the difference between
points on the contours, the comparison functions commonly
call the function SquareDifference() on page 101 of the
Appendix to determine the sum of the squared difference.
SquareDifference() applies the hillToValley ratio to the
squared difference whenever it determines that the value of 
the model contour is less than the contour being compared. 
The result of applying a hillToValley value greater than one 
is that the relative "cost" of the difference when the model 
contour is less than the target contour is smaller than the 
same difference when the model contour is greater than the 
target contour. The basis for this type of weighting is that
when comparing against a model contour, the comparison 
should treat those areas of the target contour that are subject 
to being "filled in" during a scanning or similar digitizing 
operation with less weight than regions not likely to be filled 
in, as evidenced by contour positions below the model 
contour. For instance, the regions where ascenders and 
descenders meet the body of the character are likely to be 
filled in during scanning, thereby causing the target contour 
to have a gradual contour in those regions, whereas the 
model contour would most likely have a defined peak or 
valley in these regions. Hence, the contour value of the 
model would be less than the contour value of the target, 
even though the characters may have been identical. There-
fore, the hillToValley variable attempts to minimize the
impact to the calculated difference value over these regions. 
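
The asymmetric weighting might be sketched in Python as follows. The text states only that the ratio is applied when the model value is less than the target value and that a ratio greater than one makes that case cheaper; dividing by the ratio is one reading consistent with that description and is an assumption here, as is the function name.

```python
def square_difference(model, target, hill_to_valley=1.0):
    """Squared difference, discounted where the target may be filled in."""
    diff = (model - target) ** 2
    if model < target:
        # Assumed form of "applying" the ratio: reduce the cost.
        diff /= hill_to_valley
    return diff
```

With a ratio of 2, a model value of 1 against a target value of 3 costs 2.0, while the reverse case of 3 against 1 retains the full cost of 4.0.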

It is important to note that the aforedescribed measures 
and control factors allow the comparison measures to be 
conducted in numerous permutations. However, the flexibil- 
ity which these measures permit is intended to enhance the 
applicability of the comparison process, so that when infor- 
mation is known about a particular word shape contour, for 
example, a model contour generated from a computer gen- 
erated character font, the measures may place reliance on 
that information to make the comparisons more robust. 

The mathematical explanation of the word shape deriva- 
tion process suggests that alternative methods of deriving 
the word shape signal exist. One possible alternative is
the establishment of the one dimensional signal using an
alternative coordinate scheme, for example polar coordi- 
nates. Another possibility is generation of signal g(t), where 
g(t) represents the direction from each contour point to the 
succeeding contour point, where t would represent the point 
number. 
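
As a small illustration of the second alternative, a direction signal g(t) might be computed in Python as follows. Representing the direction as an angle in radians obtained from atan2 is an assumption, as is the function name; the text specifies only that g(t) represents the direction from each contour point to its successor.

```python
import math


def direction_signal(points):
    """g(t): direction from contour point t to point t + 1, in radians."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


# A step to the right followed by a step upward.
g = direction_signal([(0, 0), (1, 0), (1, 1)])
```

The resulting signal has one fewer sample than the contour has points, since the final point has no successor.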

Depending on the particular application, and the relative 
importance of processing speed versus accuracy, for 
example, comparisons of different degrees of precision can 
be performed. For example, useful comparisons can be 
based on length, width or some other measurement dimen- 
sion of the image unit (or derived image unit shape repre-
sentation, e.g., the largest figure in a document image); the
location or region of the image unit in the document (includ- 
ing any selected figure or paragraph of a document image, 
e.g., headings, initial figures, one or more paragraphs or 
figures), font, typeface, cross-section (a cross-section being 
a sequence of pixels of similar state in an image unit); the 
number of ascenders; the number of descenders; the average 
pixel density; the length of a top line contour, including 
peaks and troughs; the length of a base contour, including 
peaks and troughs; the location of image units with respect 
to neighboring image units; vertical position; horizontal 
inter-image unit spacing; and combinations of such classi-
fiers. Thus, for example, if a selection criterion is chosen to
produce a document summary from titles in the document,
only title information in the document need be retrieved by
the image analysis processes described above. On the other 
hand, if a more comprehensive evaluation of the document 
contents is desired, then more comprehensive identification 
techniques would need to be employed. 






In addition, morphological image recognition techniques 
such as those disclosed in concurrently filed U.S. patent 
application Ser. No. 07/775,174, to Bloomberg et al., and 
entitled "Methods and Apparatus for Automatic Modifica- 
tion of Selected Semantically Significant Portions of a 
Document Without Document Image Decoding", can be 
used to recognize specialized fonts and typefaces within the 
document image. 

More particularly, the above reference provides a method 
for automatically emphasizing selected information within 
the data or text of a document image. Referring to FIG. 9, 
the first phase of the image processing technique of the 
method involves the segmentation of the image into unde-
coded information containing image units (step 920) using
techniques described above. Then the locations of and
spatial relationships between the image units on a page are
determined (step 925), as previously described.

The discrimination step 930, which was previously 
described, is next performed to identify the image units 
which have insufficient information content to be useful in 
evaluating the subject matter content of the document being 
processed. Such image units include stop or function words, 
i.e., prepositions, articles and other words that play a largely 
grammatical role, as opposed to nouns and verbs that convey 
topic information. 

Next, in step 940, selected image units, e.g., the image 
units not discriminated in step 930, are evaluated, without 
decoding the image units being classified or reference to 
decoded image data, based on an evaluation of predeter- 
mined morphological (structural) image characteristics of 
the image units. The evaluation entails a determination (step 
941) of the morphological image characteristics and a com- 
parison (step 942) of the determined morphological image 
characteristics for each image unit. The determined mor-
phological image characteristic(s), e.g., the derived image
unit shape representations, of each selected image unit are 
compared, either with the determined morphological image 
characteristic(s)/derived image unit shape representations of 
the other selected image units (step 942A), or with prede- 
termined/user-selected morphological image characteristics 
to locate specific types of image units (step 942B). The
determined morphological image characteristics of the
selected image units are advantageously compared with each
other for the purpose of identifying equivalence classes of
image units such that each equivalence class contains most
or all of the instances of a given image unit in the document,
and the relative frequencies with which image units occur in 
a document can be determined. 

It will be appreciated that the specification of the mor- 
phological image characteristics for titles, headings, cap-
tions, linguistic criteria or other significance indicating
features of a document image can be predetermined and 
selected by the user to determine the selection criteria 
defining a "significant" image unit. Comparing the image 
characteristics of the selected image units of the document 
image for matches with the image characteristics associated 
with the selection criteria permits the significant image units 
to be readily identified without any document decoding. 

Any of a number of different methods of comparison can 
be used. One technique that can be used, for example, is
correlating the raster images of the extracted image units 
using decision networks, such technique being described for 
characters in a Research Report entitled "Unsupervised 
Construction of Decision Networks for Pattern Classifica- 
tion" by Casey et al., IBM Research Report, 1984, incor- 
porated herein in its entirety. 

Other techniques that can be used to identify equiva-
lence classes of word units are the word shape comparison
techniques disclosed in U.S. patent application Ser. Nos.
07/796,119 and 07/795,169, filed concurrently herewith by
Huttenlocher and Hopcroft, and by Huttenlocher, Hopcroft
and Wayner, respectively, and entitled, respectively, "Opti-
cal Word Recognition By Examination of Word Shape" and
"Method for Comparing Word Shapes." One method that pro-
vides an adequate comparison for purposes of determining
phrase frequency is to compare only the length and height of
the derived image unit shape representations. Such a com-
parison is particularly fast, resulting in a highly efficient 
phrase frequency analysis which has proven to be suffi- 
ciently robust to reliably extract significant phrases in many 
text document applications. 
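
A length-and-height comparison of this kind might be sketched in Python as follows. The greedy grouping strategy, the pixel tolerance, and the names are assumptions made for illustration; the text specifies only that the length and height of the shape representations are compared.

```python
def group_by_size(boxes, tol=1):
    """Group image units whose width and height agree within `tol` pixels.

    `boxes` is a list of (width, height) pairs; the result is a list of
    equivalence classes, each a list of indices into `boxes`.
    """
    classes = []  # pairs of (representative box, member indices)
    for idx, (w, h) in enumerate(boxes):
        for rep, members in classes:
            if abs(rep[0] - w) <= tol and abs(rep[1] - h) <= tol:
                members.append(idx)
                break
        else:
            # No existing class is close enough: start a new one.
            classes.append(((w, h), [idx]))
    return [members for _, members in classes]


groups = group_by_size([(40, 12), (41, 12), (80, 15), (40, 13)])
```

The size of each class then serves as an estimate of how often the corresponding image unit occurs in the document.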

In instances in which multiple page documents are pro- 
cessed, each page is processed and the data held in the 
memory 15 (see FIG. 1), as described above. The entirety of 
the data can then be processed. 

The second phase of the document analysis according to 
this method involves further processing (step 950) of the 
scanned document image to emphasize the identified image
units. The emphasis can be provided in numerous ways. One 
exemplary way is to augment the document image so that the 
identified significant image units are underscored, high- 
lighted with color, or presented as margin notations. 

Another exemplary way is to modify the shape and/or 
other appearance attributes of the significant image units 
themselves in a manner which emphasizes them relative to 
the other image units in the document image. The appear- 
ance modification can be accomplished using any conven- 
tional image modification techniques, or, advantageously, 
the following morphological bitmap modification tech- 
niques. 

In accordance with this method, one or more selected 
morphological operations are performed uniformly on the 
entire bitmap for a selected image unit to modify at least one 
shape characteristic thereof. It will be appreciated that the 
selection of bitmap operations may be performed automati- 
cally or interactively. 

Examples of ways in which the appearance changes 
described above can be accomplished are as follows. The 
type style text can be "boldened" by either "dilation" or 
using a connectivity-preserving (CP) thickening operation. 
It can be "lightened" by either "erosion" or a CP thinning 
operation. (As will be appreciated by those skilled in the art, 
dilation and erosion are morphological operations which 
map a source image onto an equally sized destination image 
according to a rule defined by a pixel pattern called a 
structuring element (SE). An SE is defined by a center
location and a number of pixel locations, each having a 
defined value (ON or OFF). The pixels defining the SE do 
not have to be adjacent each other. The center location need 
not be at the geometrical center of the pattern; indeed it need 
not even be inside the pattern. In a dilation, a given pixel in 
the source image being ON causes the SE to be written into 
the destination image with the SE center at the correspond- 
ing location in the destination image. The SEs used for 
dilation typically have no OFF pixels. In an erosion, a given 
pixel in the destination image is turned ON if and only if the 
result of superimposing the SE center on the corresponding 
pixel location in the source image results in a match between 
all ON and OFF pixels in the SE and the underlying pixels 
in the source image.) 

Such dilation/thickening and erosion/thinning operations 
can be either isotropic (the same horizontally and vertically)
or anisotropic (e.g., different in horizontal and vertical 
directions). 
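
The dilation and erosion rules described above might be sketched in Python as follows, for structuring elements that contain only ON pixels. Representing an image as a set of (x, y) coordinates of ON pixels and an SE as a list of offsets from its center is an assumption made for brevity, as are the function names; matching of OFF pixels in the SE is not covered by this sketch.

```python
def dilate(on_pixels, se, width, height):
    """Write the SE into the destination at every ON source pixel."""
    out = set()
    for (x, y) in on_pixels:
        for (dx, dy) in se:
            px, py = x + dx, y + dy
            if 0 <= px < width and 0 <= py < height:
                out.add((px, py))
    return out


def erode(on_pixels, se, width, height):
    """Turn a pixel ON only if the SE, centered there, fits the source."""
    return {(x, y)
            for x in range(width) for y in range(height)
            if all((x + dx, y + dy) in on_pixels for (dx, dy) in se)}


# A two-pixel horizontal SE thickens strokes to the right only,
# which gives an anisotropic "boldening".
se = [(0, 0), (1, 0)]
bold = dilate({(1, 1)}, se, 3, 3)
thin = erode(bold, se, 3, 3)
```

Using an SE that extends equally in all directions would give the isotropic case instead.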

Although optical character recognition (OCR) techniques 
are required, for example, in order to convert the typestyle
of a selected word unit to italic, a similar type of emphasis
can be achieved through the morphological operation of 
horizontal shearing to achieve the slant typestyle. Slant is a 
variant of roman type style that is created from roman using 
a horizontal shear of about 12 degrees (this is the approxi-
mate slant angle of italic style characters). The sheared
images can slant forwards, backwards, or even upwards, if
desired. Text can also be bit inverted (black for white and
vice versa) for emphasis, or words can be emphasized or
de-emphasized by scaling up or down, respectively. In the
case of scaling, it may also be desirable to change the 
thickness of the lines in the image unit in addition to simple 
scaling. 
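
A horizontal shear of a word bitmap might be sketched in Python as follows. The pixel-set representation, the function name, and the convention that row 0 is the top of the image are assumptions; the 12 degree default follows the slant angle mentioned above.

```python
import math


def shear_right(on_pixels, height, angle_deg=12.0):
    """Shift each row rightward in proportion to its height above the base."""
    k = math.tan(math.radians(angle_deg))
    # Row 0 is assumed to be the top row, so it receives the largest shift.
    return {(x + round(k * (height - 1 - y)), y) for (x, y) in on_pixels}


# A three-pixel vertical stroke sheared by 45 degrees for clarity.
slanted = shear_right({(0, 0), (0, 1), (0, 2)}, height=3, angle_deg=45.0)
```

A negative angle would produce the backward slant, and shearing along the other axis would produce the upward slant.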

Thus, using such morphological bitmap alteration pro-
cesses, hand marks such as underlining, side lining, circling,
highlighting, and so forth, can be extracted from the image,
and removed from the original bitmap by XOR operations.
Removal of color highlight marks requires capture of a gray
scale (or color) scanned image. Once captured, removal is
relatively easy using the appropriate thresholding. The
resulting image is similar in quality to that of un-highlighted
marks. Words that are highlighted can be identified from the
highlight mask and word boxes, using known seed-growing
methods. The appearance of these words can be altered at
will.

A salient feature provided by the method of the invention
is that the initial processing and identification of significant
image units is accomplished without an accompanying
requirement that the content of the image units be decoded,
or that the information content of the document image
otherwise be understood. More particularly, to this stage in
the process, the actual content of the word units is not
required to be specifically determined. Thus, for example, in
such applications as copier machines or electronic printers
that can print or reproduce images directly from one docu-
ment to another without regard to ASCII or other encoding/
decoding requirements, image units can be identified and
processed using one or more morphological image charac-
teristics or properties of the image units. The image units of
unknown content can then be further optically or electroni-
cally processed. One of the advantages that results from the
ability to perform such image unit processing without hav-
ing to decode the image unit contents at this stage of the
process is that the overall speed of image handling and
manipulation can be significantly increased.

The second phase of the document analysis of the inven- 
tion involves processing (step 50) the identified significant 
image units to produce an auxiliary or supplemental docu- 
ment image reflective of the contents of the source document 
image. It will be appreciated that the format in which the
identified significant image units are presented can be varied
as desired. Thus, the identified significant image units could
be presented in reading order to form one or more phrases,
or presented in a listing in order of relative frequency of
occurrence. Likewise, the supplemental document image
need not be limited to just the identified significant image
units. If desired, the identified significant image units can be
presented in the form of phrases including adjacent image
units presented in reading order sequence, as determined
from the document location information derived during the
document segmentation and structure determination steps 20
and 25 described above. Alternatively, a phrase frequency
analysis as described above can be conducted to limit the
presented phrases to only the most frequently occurring
phrases.

The present invention is similarly not limited with respect 
to the form of the supplemental document image. One
application for which the information retrieval technique of
the invention is particularly suited is for use in reading 
machines for the blind. One embodiment supports the des- 
ignation by a user of key words, for example, on a key word 
list, to designate likely points of interest in a document. 
Using the user designated key words, occurrences of the 
word can be found in the document of interest, and regions 
of text forward and behind the key word can be retrieved and 
processed using the techniques described above. Or, as 
mentioned above, significant key words can be automati- 
cally selected according to prescribed criteria, such as fre- 
quency of occurrence, or other similar criteria, using the 
morphological image recognition techniques described 
above; and a document automatically summarized using the 
determined words. 

Another embodiment supports an automatic location of 
significant segments of a document according to other 
predefined criteria, for example, document segments that are 
likely to have high informational value such as titles, regions 
containing special font information such as italics and 
boldface, or phrases that receive linguistic emphasis. The 
location of significant words or segments of a document may 
be accomplished using the morphological image recognition 
techniques described above. The words thus identified as 
significant words or word units can then be decoded using 
optical character recognition techniques, for example, for 
communication to the blind user in a Braille or other form 
which the blind user can comprehend. For example, the 
words which have been identified or selected by the tech- 
niques described above can either be printed in Braille form 
using an appropriate Braille format printer, such as a printer 
using plastic-based ink; or communicated orally to the user 
using a speech synthesizer output device. 

Once a condensed document is communicated, the user
may wish to return to the original source to have a full text
rendition printed or read aloud. This may be achieved in a number
of ways. One method is for the associated synthesizer or
Braille printer to provide source information, for example,
"on top of page 2 is an article entitled ..." The user would
then return to the point of interest.

Two classes of apparatus extend this capability by
providing the possibility of user interaction while the
condensed document is being communicated. One type of
apparatus is a simple index marker. This can be, for instance, 
a hand held device with a button that the user depresses 
whenever he or she hears a title of interest, or, for instance, 
an N-way motion detector in a mouse 19 (FIG. 2) for 
registering a greater variety of commands. The reading 
machine records such marks of interest and returns to the 
original article after a complete summarization is commu- 
nicated. 

Another type of apparatus makes use of the technology of 
touch-sensitive screens. Such an apparatus operates by 
requiring the user to lay down a Braille summarization sheet 
41 on a horizontal display. The user then touches the region 
of interest on the screen 42 in order to trigger either a full 
printout or synthesized reading. The user would then indi- 
cate to the monitor when a new page was to be processed. 
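
One way to picture the touch-screen apparatus is as a lookup from the touched point to the summary region lying beneath it. The sketch below is an assumption-laden illustration only: the `SummaryRegion` type, its `sourcePage` field, and `FindTouchedRegion` are hypothetical and are not part of the disclosed apparatus.

```c
/* Hypothetical description of one entry on the Braille summarization
 * sheet, in screen coordinates, with a link back to the source page. */
typedef struct {
    int x, y, width, height;
    int sourcePage;
} SummaryRegion;

/* Returns the index of the first region containing the touched point,
 * or -1 if the touch falls outside every region. */
int FindTouchedRegion(const SummaryRegion *regions, int numberOfRegions,
                      int touchX, int touchY)
{
    int i;
    for (i = 0; i < numberOfRegions; ++i) {
        const SummaryRegion *r = &regions[i];
        if (touchX >= r->x && touchX < r->x + r->width &&
            touchY >= r->y && touchY < r->y + r->height)
            return i;
    }
    return -1;
}
```

The returned index would select which original passage to print in full or to read through the speech synthesizer.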

It will be appreciated that the method of the invention as 
applied to a reading machine for the blind reduces the 
amount of material presented to the user for evaluation, and 
thus is capable of circumventing many problems inherent in 
the use of current reading technology for the blind and 
others, such as the problems associated with efficient brows- 
ing of a document corpus, using synthesized speech, and the 
problems created by the bulk and expense of producing 
Braille paper translations, and the time and effort required by 
the user to read such copies. 

The present invention is useful for forming abbreviated
document images for browsing (image gists). A reduced
representation of a document is created using a bitmap
image of important terms in the document. This enables a
user to quickly browse through a scanned document library,
either electronically, or manually if summary cards are
printed out on a medium such as paper. The invention can
also be useful for document categorization (lexical gists). In
this instance, key terms can be automatically associated with
a document. The user may then browse through the key
terms, or the terms may be further processed, such as by
decoding using optical character recognition.

Although the invention has been described and illustrated 
with a certain degree of particularity, it is understood that the 
present disclosure has been made only by way of example, 
and that numerous changes in the combination and arrange- 
ment of parts can be resorted to by those skilled in the art 
without departing from the spirit and scope of the invention, 
as hereinafter claimed. 

Jul 26 19:28 1991 args.h

/* Support for command line argument scanning.
 *
 * When a program is run from the shell, its name is followed by a number of
 * required command line ARGUMENTS and then some optional command line OPTIONS.
 * Each argument consists of a list of required PARAMETERS, each of which can
 * be either an int, string, or float. Options are like arguments with the
 * exception that their required parameters are preceded by a keyword denoting
 * which option is being invoked.
 *
 * Required arguments are defined using the DefArg function. The format string
 * consists of a list of data format specifiers (%d, %f, and %s for integer, float, and
 * string, respectively) that specify the types of the parameters to the argument.
 * The documentation string should contain a one line description of the argument.
 * It will be printed if the argument list cannot be scanned.
 * The remaining arguments to DefArg are pointers to locations where the values of the
 * command line arguments will be stored.
 *
 * Optional arguments are defined with the DefOption function. The format string
 * is similar to the DefArg format string, but has a keyword before the format
 * specifiers. The exists parameter is a pointer to BOOLEAN that is set to true
 * iff an occurrence of this option was successfully parsed from the command line.
 * The remaining arguments are pointers to the locations where the values of the
 * command line arguments will be stored.
 *
 * Short example:
 * The following program expects one required command line argument that is a string
 * and will be stored in s. In addition, it will accept three different optional
 * keyword arguments. They are the keyword -int followed by an integer, with result
 * stored in i; -float followed by a float stored in f; and -pair followed by a float
 * and then an int, stored in f and i, respectively.
 *
 * Suppose the program is called foo. Here are some legal invocations:
 * % foo hello
 * % foo hello -int 1
 * % foo hello -int 5 -float 10
 * % foo hello -pair 1 2
 *
 * Here are some error invocations and responses
 * % foo
 * Usage:
 *  scanArgs
 *   filename
 *   [-int <int>]
 *   [-float <float>]
 *   [-pair <float> <int>]
 * % foo hello -int
 * Option -int expects 1 parameters:
 *  -int <int>
 *
 *
 * void main(int argc,char **argv)
 * {
 *   int i;
 *   float f;
 *   char *s;
 *   BOOLEAN haveAString,haveAnInt,haveAFloat,haveAPair;
 *
 *   DefArg("%s","filename",&s);
 *   DefOption("-int %d","-int <int>",&haveAnInt,&i);
 *   DefOption("-float %f","-float <float>",&haveAFloat,&f);
 *   DefOption("-pair %f %d","-pair <float> <int>",&haveAPair,&f,&i);
 *
 *   ScanArgs(argc,argv);
 *
 *   printf("%s\n",s);
 *   if (haveAPair)
 *     printf("%f %d\n",f,i);
 *   if (haveAnInt)
 *     printf("%d\n",i);
 *   if (haveAFloat)
 *     printf("%f\n",f);
 *   if (haveAString)
 *     printf("%s\n",s);
 * }
 *
 */

/* Possible additions:
 * 1) Passing NULL pointers as exists variables.
 * 2) Predicate calculus for error checking.
 * 3) Only need one DefArg call.
 * 4) Combine with error.c to save program name info.
 */
void DefArg(char *format,char *documentation,...);
void DefOption(char *format,char *documentation,BOOLEAN *exists,...);
void ScanArgs(int argc,char **argv);

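The way a DefArg or DefOption format string maps onto parameter types can be illustrated in isolation. The following is a simplified, self-contained sketch modeled on the loop in args.c; the function name `ParseFormat` and its return convention are assumptions made for this example and are not part of the listing.

```c
typedef enum { INTEGER, FLOAT, STRING } ParamType;

/* Walks a format string such as "-pair %f %d" and records the type of
 * each %d, %f, or %s specifier in types[]. Returns the number of
 * parameters found, or -1 if an unknown specifier is seen or more than
 * maxTypes parameters are requested. */
int ParseFormat(const char *format, ParamType *types, int maxTypes)
{
    const char *p;
    int n = 0;
    for (p = format; *p; ++p) {
        if (*p != '%')
            continue;           /* keyword text and spaces are skipped */
        if (n == maxTypes)
            return -1;
        ++p;                    /* look at the character after '%' */
        switch (*p) {
        case 'd': types[n++] = INTEGER; break;
        case 'f': types[n++] = FLOAT;   break;
        case 's': types[n++] = STRING;  break;
        default:  return -1;    /* also catches a trailing '%' */
        }
    }
    return n;
}
```

Unlike the listing, which reports problems through DoError and exits, this sketch returns -1 so that the caller can decide how to respond.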
Jan 11 17:00 1991 baselines.h

List BaseLines(Picture pict, double angle, char *plotFile);
#ifdef foo
               int *count,
               int **returnCoordx, int **returnCoordy);
#endif
void DrawBaseLines(Picture pict, List pointList, double angle);

Section A APPENDIX / Page 4

Aug 23 13:03 1991 blobify.h

Picture Blobify(Picture old,int half_mask_size,double threshold);
Picture NewBlobify(Picture old,int halfMaskWidth,double threshold,double angle);

Aug 1 02:59 1991 boolean.h

typedef int BOOLEAN;
#define FALSE 0
#define TRUE (!FALSE)

Jan 11 17:00 1991 boxes.h

List FindBorders(Picture pict,double theta);
void DrawBox(Picture pict,Box box);
void DrawColorBox(Picture pict,Box box,int color);

Jul 26 13:42 1991 descriptors.h

typedef unsigned char *Descriptor,DescriptorElement;

void PrintField(char *s,int w);
void PrintDescriptor(Descriptor d,int *starCount,int *correctCount);
void PrintWords(char **words,int numberOfWords);
Descriptor ComputeDescriptor(int modelIndex,Dictionary models,
                             Dictionary thisFont,int numberOfWords,
                             DiffDescriptor dd);

#define MAX_FONTS (20)
#define MAX_WORDS (100)

Jan 16 12:55 1991 dict.h

/* Dictionary files have the following format:
 *   int magic number = 1234567
 *   int numberOfEntries
 *   int infoStringLength (includes the \0 at the end)
 *   char infoString[infoStringLength]
 *   OutlinePairBody[numberOfEntries]
 */

typedef struct {
  Box box;
  float blackoutHeight;
  int numberOfLegs;
  int offset;
  int width;
  float *x;
  float *top;
  float *bottom;
} *OutlinePair,OutlinePairBody;

typedef struct {
  Box box;
  int numberOfLegs;
  int *x;
  int *top;
  int *bottom;
} *RawOutlinePair,RawOutlinePairBody;

typedef struct {
  int numberOfEntries;
  char *infoString;
  RawOutlinePair *rawOutlines;
  OutlinePair *outlines;
} *Dictionary,DictionaryBody;

void WriteDictionary(Dictionary dict, char *filename);
Dictionary ReadDictionary(char *filename);
Dictionary NewDict(int numberOfEntries);
char *ArgListToString(int argc, char **argv);

Jul 30 23:04 1991 diff.h

typedef enum {L2,CONSTRAINED,WARP} DiffType;

typedef struct {
  DiffType diffType;
  BOOLEAN lengthNormalize;
  BOOLEAN separate;
  float centerWeight;
  int bandwidth;
  float topToBottom;
  float hillToValley;
  FILE *pathFP;
} *DiffDescriptor,DiffDescriptorBody;

Picture CompareDictionaries(Dictionary dict1, Dictionary dict2,DiffDescriptor dd);
void WritePictureAsAscii(Picture pict, char *filename,
                         char *info1, char *info2);
float DiffPair(OutlinePair one, OutlinePair two,DiffDescriptor dd);
#ifdef foo
float DiffPairAndPath(OutlinePair one, OutlinePair two,DiffDescriptor dd);
#endif

Jan 15 18:56 1991 diff2.h

#ifdef OWNER
#define EXTERN
#else
#define EXTERN extern
#endif /* OWNER */

EXTERN int FileCountX;
EXTERN int FileCountY;

float DiffPair(OutlinePair one, OutlinePair two, char *matchtype,
               char *pathFile);

Jul 26 19:29 1991 error.h

/* Possible additions:
 * 1) Variable numbers of parameters to DoError().
 * 2) Error recovery language.
 */
void DoError(char *string1,char *string2);

Aug 15 06:37 1991 fontNorm.h

void StoreRawOutlinePair(Dictionary dict, int dictEntry,
                         Box box,int *bothX,int *topY, int *baseY,
                         int numberOfLegs);

#define HIT_THE_BOX (10000)

Jan 11 17:00 1991 lines.h

typedef BOOLEAN pistFunc(Picture pict, int x, int y, BOOLEAN test,
                         UCHAR color);

pistFunc DrawPiston, CountPiston, DistancePiston, BaseLinePiston;

void LineEngine(Picture pict, int x1, int y1, int x2, int y2, UCHAR color,
                pistFunc PerPixel);
void DrawLine(Picture pict, int x1, int y1, int x2, int y2, UCHAR color);
float CountLine(Picture pict, int x1, int y1, int x2, int y2);
int DistanceLine(Picture pict, int x1, int y1, int x2, int y2);

Jan 11 17:00 1991 lists.h

typedef struct {
  void *car;
  void *cdr;
} cellBody,*cell;

typedef cell List;
typedef void *mapFun(void *);
typedef void collectFun(void *);

List cdr(List);
void *car(List);
void *popIntern(List *);
BOOLEAN endp(List);
List cons(void *,List);
void map(List,mapFun);
List collect(List,collectFun);
int ListLength(List l);

#define push(a,l) ((l) = cons((a),(l)))
#define pop(l) (popIntern(&(l)))
#define nil ((List)NULL)

Jan 15 18:39 1991 match.h

#ifdef OWNER
#define EXTERN
#else
#define EXTERN extern
#endif /* OWNER */

EXTERN int debug;

typedef struct {
  float cost;
  int xptr;
  int yptr;
} elt;

#define MAXSEQLENGTH 800

float DPDiffPair(OutlinePair one, OutlinePair two);
float matchvecs(float *Vec1, int lenVec1, float *Vec2, int lenVec2);
float sq_distance(float x1, float x2);
float best_score(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2);
void print_best_path(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2,
                     char *pathFile);
void print_array_costs(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2);
void print_array_dirs(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2);

/*
#ifndef debug
#define debug FALSE
#endif
*/
#ifndef horweight
#define horweight 1.5
#endif

#ifndef verweight
#define verweight 1.5
#endif

#ifndef diagweight
#define diagweight 1.0
#endif

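The declarations above suggest a dynamic-programming comparison of two sequences in which horizontal, vertical, and diagonal steps carry different weights. The bodies of these functions are not shown in this header, so the following is only a generic sketch of such a weighted alignment under stated assumptions: the recurrence, the meaning assigned to each direction, and the name `WeightedDPMatch` are all guesses made for illustration, not the listing's algorithm.

```c
#include <float.h>
#include <stdlib.h>

#define HOR_WEIGHT  1.5f   /* assumed: step along the second sequence only */
#define VER_WEIGHT  1.5f   /* assumed: step along the first sequence only */
#define DIAG_WEIGHT 1.0f   /* assumed: step along both sequences */

static float SqDistance(float x1, float x2)
{
    float d = x1 - x2;
    return d * d;
}

/* Returns the minimum accumulated weighted cost of aligning a[0..lenA-1]
 * with b[0..lenB-1], or -1.0f on empty input or allocation failure.
 * Only two rows of the cost table are kept in memory. */
float WeightedDPMatch(const float *a, int lenA, const float *b, int lenB)
{
    float *prev, *curr, *tmp, result;
    int i, j;

    if (lenA <= 0 || lenB <= 0)
        return -1.0f;
    prev = (float *)malloc((size_t)lenB * sizeof(float));
    curr = (float *)malloc((size_t)lenB * sizeof(float));
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1.0f;
    }
    for (i = 0; i < lenA; ++i) {
        for (j = 0; j < lenB; ++j) {
            float d = SqDistance(a[i], b[j]);
            float best = FLT_MAX, c;
            if (i == 0 && j == 0)
                best = DIAG_WEIGHT * d;          /* starting cell */
            if (i > 0 && j > 0 && (c = prev[j - 1] + DIAG_WEIGHT * d) < best)
                best = c;
            if (i > 0 && (c = prev[j] + VER_WEIGHT * d) < best)
                best = c;
            if (j > 0 && (c = curr[j - 1] + HOR_WEIGHT * d) < best)
                best = c;
            curr[j] = best;
        }
        tmp = prev; prev = curr; curr = tmp;     /* finished row becomes prev */
    }
    result = prev[lenB - 1];
    free(prev);
    free(curr);
    return result;
}
```

With the diagonal weight lower than the other two, the alignment favors advancing both sequences together and penalizes stretching one against the other.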
Jan 15 18:47 1991 matchparallel.h

float pl_DPDiffPair(OutlinePair one, OutlinePair two, char *pathFile);
float pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1, float *Vec2t, float *Vec2b,
                   int lenVec2, char *pathFile);

float faster_pl_DPDiffPair(OutlinePair one, OutlinePair two, char *pathFile);
float faster_pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1, float *Vec2t, float *Vec2b,
                          int lenVec2, char *pathFile);

float simple_pl_DPDiffPair(OutlinePair one, OutlinePair two);
float simple_pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1, float *Vec2t, float *Vec2b,
                          int lenVec2);

Jul 9 16:01 1991 misc.h

/*
 *
 * misc.h - miscellaneous types and declarations
 *
 */

/* Some library routines that never seem to get declared */

/* Memory allocation functions */
extern void *malloc(unsigned size);
extern void *calloc(unsigned nelem, unsigned elsize);
extern void *realloc(void *p, unsigned size);
extern void free(void *p);

/* I don't feel like including setjmp.h */
/*
extern int _setjmp(jmp_buf env);
extern volatile void longjmp(jmp_buf env, int val);
*/

/* String-to-X functions */
extern int atoi(char *s);
extern double atof(char *s);

/* String functions */
extern int strcmp(char *s1, char *s2);
extern int strncmp(char *s1, char *s2, int n);
extern char *strcpy(char *d, char *s);
extern char *strncpy(char *d, char *s, int n);
int strlen(char *s);
extern char *strdup(char *);
extern char *strchr(char *s,char c);

/* stdio functions */
extern int fclose(FILE *stream);
extern int fread(char *ptr, int size, int nitems, FILE *stream);
extern int fwrite(char *ptr, int size, int nitems, FILE *stream);
/* these are necessary to avoid implicit declarations */
extern int _flsbuf();
extern int _filbuf();

/* Formatted I/O functions */
extern int printf(char *format, ...);
extern int scanf(char *format, ...);
extern int fprintf(FILE *stream, char *format, ...);
extern int fscanf(FILE *stream, char *format, ...);

/* and misc stuff */
extern volatile void exit(int val);

extern void perror(char *s);

Aug 1 02:59 1991 mylib.h

#include "error.h"
#include "boolean.h"
#include "lists.h"
#include "args.h"
#include "pict.h"
#include "read.h"

Aug 15 06:36 1991 newContour.h

void BoxToShell(Picture pict,Box box,List baseLinePoints,
                Dictionary dict,int dictEntry,NormalizationDescriptor *nd);
void BarBoxList(Picture pict, List boxList,List baseLinePoints,
                char *filename, char *infoString, NormalizationDescriptor *nd);

Jul 31 17:11 1991 newMatch.h

extern float hillToValley;
extern float L2Compare(OutlinePair o1,OutlinePair o2,float topToBottom);
extern float NewMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
                      float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
                      float topToBottom);
extern float SepMatch(float *a1,int aLength,float *b1,int bLength,
                      float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth);
extern float NewMatchAndPath(float *a1,float *a2,int aLength,float *b1,float *b2,
                             int bLength,float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
                             float topToBottom,FILE *fp);
extern float SlopeCMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
                         float centerWeight,BOOLEAN lengthNormalize,float topToBottom);
extern float SepSlopeCMatch(float *a1,int aLength,float *b1, int bLength,
                            float centerWeight,BOOLEAN lengthNormalize);
extern float SlopeCMatchAndPath(float *a1,float *a2,int aLength,float *b1, float *b2,
                                int bLength,float centerWeight,BOOLEAN lengthNormalize,float topToBottom,
                                FILE *pathFP);

Jan 11 17:00 1991 numbers.h

void DrawNumber(Picture pict, int x, int y, int color, float scale, int n);

Jan 14 16:52 1991 orient.h

BOOLEAN Coarse(Picture pict, int coarseSamples, int coarseDirections,
               float *orientation, char *plotFile);

float Fine(Picture pict,int fineSamples, int fineDirections,
           int coarseDirections, float coarseAngle, char *plotFile);

float NewFine(Picture pict,int fineSamples, int fineDirections,
              float angleStart,float angleEnd, char *plotFile);

Aug 23 19:19 1991 pict.h

typedef unsigned char UCHAR;

#define ROUND8(x) ((x%8)?(x+8-x%8):x)
#define ROUND16(x) ((x%16)?(x+16-x%16):x)
#define ROUND2(x) ((x%2)?(x+1):x)

typedef int Color;
#define COLOR_RED 0
#define COLOR_GREEN 1
#define COLOR_BLUE 2

typedef struct cmapstruct {
  int numberOfEntries;
  UCHAR *red;
  UCHAR *green;
  UCHAR *blue;
} ColorMapBody, *ColorMap;

typedef struct pstruct {
  int width;
  int height;
  int depth;
  int uchar_width;
  ColorMap cmap;
  UCHAR *data;
} PictureBody, *Picture;

void dberror(char *string1,char *string2);

ColorMap NewColorMap(int size);
void FreeColorMap(ColorMap cmap);
UCHAR ReadColorValue(ColorMap cmap, Color primary,int index);
UCHAR WriteColorValue(ColorMap cmap, int index, UCHAR red, UCHAR green,
                      UCHAR blue);
Picture new_pict(int width,int height,int depth);
void free_pict(Picture pict);
Picture load_pict(char *filename);
Picture load_header(FILE *fp);
void write_pict(char *filename,Picture pict);
void write_header(FILE *fp, Picture pict);
/* int BytesPerScanline(Picture pict); */
#define BytesPerScanline(pict) (pict->uchar_width)

UCHAR ReadPixel(Picture pict,int x,int y);
void WritePixel(Picture pict,int x,int y,int color);
void WriteClippedPixel(Picture pict,int x,int y,int color);
void CopyPicture(Picture dest,Picture src);

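The rounding macros determine how many bytes each scanline occupies, which is the value that BytesPerScanline reads back from uchar_width. The sketch below restates that computation as a stand-alone function, following the depth cases that new_pict handles in pict.c. The function name `UcharWidth`, the -1 return for depths not shown, and the extra parentheses in the macros (added so that arguments such as `a+b` expand safely) are assumptions of this example.

```c
/* Parenthesized variants of the rounding macros (an editorial safety
 * measure; the listing's macros omit the inner parentheses). */
#define ROUND2(x)  (((x) % 2) ? ((x) + 1) : (x))
#define ROUND16(x) (((x) % 16) ? ((x) + 16 - (x) % 16) : (x))

/* Bytes per scanline for a picture of the given width and depth.
 * Returns -1 for a depth this sketch does not cover. */
int UcharWidth(int width, int depth)
{
    if (depth == 32)
        return width * 4;              /* four bytes per pixel */
    if (depth == 8)
        return ROUND2(width);          /* one byte per pixel, even length */
    if (depth == 1)
        return ROUND16(width) >> 3;    /* bits padded to 16, then to bytes */
    return -1;
}
```

For a 1-bit image 100 pixels wide, the width is first padded to 112 bits and then divided by 8, giving 14 bytes per scanline.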
Jul 26 13:09 1991 read.h

int ReadInt(FILE *fp);
int ReadFloat(FILE *fp);
char *ReadString(FILE *fp);

Aug 15 00:19 1991 types.h

typedef struct {
  BOOLEAN noAscenderNormalize;
  BOOLEAN noXHeightNormalize;
} NormalizationDescriptor;

typedef struct {
  int x;
  int y;
  int width;
  int height;
  int pageX;
  int pageY;
  double angle;
} BoxBody,*Box;

typedef struct {
  int x;
  int y;
} PointBody,*Point;

Box MakeBox(int x,int y,int width,int height,double angle);
Point MakePoint(int x,int y);

Jul 26 13:25 1991 Makefile

CCFLAGS = -g -c -I/net/piglet/piglet-1c/hopcroft/new/include

INCLUDE = /net/piglet/piglet-1c/hopcroft/new/include/

ARGS = $(INCLUDE)args.h
BOOLEAN = $(INCLUDE)boolean.h
ERROR = $(INCLUDE)error.h
LISTS = $(INCLUDE)lists.h
MISC = $(INCLUDE)misc.h
PICT = $(INCLUDE)pict.h
READ = $(INCLUDE)read.h

OFUNS = args.o error.o pict.o lists.o read.o

mylib.a: $(OFUNS)
	ld -r $(OFUNS) -o mylib.a

args.o: args.c $(BOOLEAN) $(ERROR) $(MISC) $(ARGS)
	gcc $(CCFLAGS) args.c

error.o: error.c $(ERROR)
	gcc $(CCFLAGS) error.c

pict.o: pict.c $(BOOLEAN) $(ERROR) $(PICT)
	gcc $(CCFLAGS) pict.c

lists.o: lists.c $(BOOLEAN) $(LISTS)
	gcc $(CCFLAGS) lists.c

read.o: read.c $(MISC) $(READ)
	gcc $(CCFLAGS) read.c

Jul 26 13:23 1991 args.c

#include <stdio.h>
#include <stdarg.h>
#include "error.h"
#include "boolean.h"
#include "misc.h"
#include "args.h"

#define MAX_NAME_LENGTH (50)
#define MAX_PARAMETERS (6)
#define MAX_OPTIONS (20)
#define MAX_ARGS (20)

typedef enum {INTEGER,FLOAT,STRING} ParamType;

typedef struct {
  char *documentation;
  int numberOfParameters;
  ParamType types[MAX_PARAMETERS];
  void *values[MAX_PARAMETERS];
} *Arg,ArgBody;

typedef struct {
  char optionName[MAX_NAME_LENGTH+1];
  char *documentation;
  BOOLEAN *exists;
  int numberOfParameters;
  ParamType types[MAX_PARAMETERS];
  void *values[MAX_PARAMETERS];
} *Option,OptionBody;

static BOOLEAN optionsRequired = TRUE;
static int numberOfArguments = 0;
static ArgBody args[MAX_ARGS];
static int numberOfOptions = 0;
static OptionBody options[MAX_OPTIONS];

void DefArg(char *format,char *documentation,...)
{
  va_list ap;
  char *p;
  int i;
  int parameterCounter;

  if (numberOfArguments == MAX_ARGS)
    DoError("DefArg: too many command line options now:\"%s\".\n",format);

  args[numberOfArguments].documentation = documentation;

  /* now parse the format string */
  /* get option parameters */
  va_start(ap,documentation);
  for (p = format,parameterCounter = 0;*p;p++) {
    if (*p == '%') {
      if (parameterCounter == MAX_PARAMETERS)
        DoError("DefArg: too many parameters in \"%s\".\n",format);
      p++;
      switch (*p) {
      case 'd':
        args[numberOfArguments].types[parameterCounter] = INTEGER;
        args[numberOfArguments].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      case 'f':
        args[numberOfArguments].types[parameterCounter] = FLOAT;
        args[numberOfArguments].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      case 's':
        args[numberOfArguments].types[parameterCounter] = STRING;
        args[numberOfArguments].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      default:
        DoError("DefArg: bad option in \"%s\".\n",format);
      }
    }
  }
  args[numberOfArguments].numberOfParameters = parameterCounter;
  ++numberOfArguments;
  va_end(ap);
}

void DefOption(char *format,char *documentation,BOOLEAN *exists,...)
{
  va_list ap;
  char *optionName;
  char *p;
  int i;
  int parameterCounter;

  if (numberOfOptions == MAX_OPTIONS)
    DoError("DefOption: too many command line options now:\"%s\".\n",format);

  /* record exists so that *exists will be TRUE if this option is scanned */
  options[numberOfOptions].exists = exists;

  options[numberOfOptions].documentation = documentation;

  /* now parse the format string */
  p = format;
  /* skip leading spaces */
  while (*p == ' ' && *p != '\0')
    p++;

  /* get the option name */
  optionName = options[numberOfOptions].optionName;
  i = 0;
  while (*p != '\0' && *p != ' ' && *p != '\t') {
    if (i < MAX_NAME_LENGTH)
      optionName[i++] = *p;
    else
      DoError("DefOptions: option name too long in \"%s\".\n",format);
    p++;
  }
  optionName[i] = '\0';

  /* get option parameters */
  va_start(ap,exists);
  for (parameterCounter = 0;*p;p++) {
    if (*p == '%') {
      if (parameterCounter == MAX_PARAMETERS)
        DoError("DefOptions: too many parameters in \"%s\".\n",format);
      p++;
      switch (*p) {
      case 'd':
        options[numberOfOptions].types[parameterCounter] = INTEGER;
        options[numberOfOptions].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      case 'f':
        options[numberOfOptions].types[parameterCounter] = FLOAT;
        options[numberOfOptions].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      case 's':
        options[numberOfOptions].types[parameterCounter] = STRING;
        options[numberOfOptions].values[parameterCounter] = va_arg(ap,void *);
        parameterCounter++;
        break;
      default:
        DoError("DefOptions: bad option in \"%s\".\n",format);
      }
    }
  }
  options[numberOfOptions].numberOfParameters = parameterCounter;
  ++numberOfOptions;
  va_end(ap);
}

void PrintHelp(char *name)
{
  int i;
  fprintf(stderr,"Usage:\n %s\n",name);
  for (i = 0;i < numberOfArguments;++i)
    fprintf(stderr," %s\n",args[i].documentation);
  for (i = 0;i < numberOfOptions;++i)
    fprintf(stderr," [%s]\n",options[i].documentation);
  DoError("\n",NULL);
}

void ScanArgs(int argc,char **argv)
{
  int i,j,k;

  for (j = 0;j < numberOfOptions;++j)
    *(options[j].exists) = FALSE;

  if (argc == 1 && optionsRequired)
    PrintHelp(argv[0]);

  i = 1;
  for (j = 0;j < numberOfArguments;++j) {
    if (i + args[j].numberOfParameters > argc) {
      fprintf(stderr,"Required argument expects %d parameters:\n %s\n",
              args[j].numberOfParameters,
              args[j].documentation);
      DoError("\n",NULL);
    }
    for (k = 0;k < args[j].numberOfParameters;++k)
      switch (args[j].types[k]) {
      case INTEGER:
        *(int *)(args[j].values[k]) = atoi(argv[i++]);
        break;
      case FLOAT:
        *(float *)(args[j].values[k]) = atof(argv[i++]);
        break;
      case STRING:
        *(char **)(args[j].values[k]) = argv[i++];
        break;
      default:
        DoError("ScanArgs: internal error - bad type.\n",NULL);
      }
  }

  while (i < argc) {
    for (j = 0;j < numberOfOptions;++j)
      if (!strcmp(options[j].optionName,argv[i])) {
        if (i + options[j].numberOfParameters >= argc) {
          fprintf(stderr,"Option %s expects %d parameters:\n %s\n",
                  options[j].optionName,
                  options[j].numberOfParameters,
                  options[j].documentation);
          DoError("\n",NULL);
        }
        *(options[j].exists) = TRUE;
        ++i;
        for (k = 0;k < options[j].numberOfParameters;++k)
          switch (options[j].types[k]) {
          case INTEGER:
            *(int *)(options[j].values[k]) = atoi(argv[i++]);
            break;
          case FLOAT:
            *(float *)(options[j].values[k]) = atof(argv[i++]);
            break;
          case STRING:
            *(char **)(options[j].values[k]) = argv[i++];
            break;
          default:
            DoError("ScanArgs: internal error - bad type.\n",NULL);
          }
        break;
      }
    if (j == numberOfOptions) {
      fprintf(stderr,"Bad command line argument.\n");
      PrintHelp(argv[0]);
    }
  }
}
#ifdef foo
void main(int argc,char **argv)
{
  int i;
  float f;
  char *s;
  BOOLEAN haveAString,haveAnInt,haveAFloat,haveAPair;

  DefArg("%s","filename",&s);
  DefOption("-int %d","-int <int>",&haveAnInt,&i);
  DefOption("-float %f","-float <float>",&haveAFloat,&f);
  DefOption("-pair %f %d","-pair <float> <int>",&haveAPair,&f,&i);

  ScanArgs(argc,argv);

  printf("%s\n",s);
  if (haveAPair)
    printf("%f %d\n",f,i);
  if (haveAnInt)
    printf("%d\n",i);
  if (haveAFloat)
    printf("%f\n",f);
  if (haveAString)
    printf("%s\n",s);
}
#endif

Jul 26 12:57 1991 error.c

#include <stdio.h>
#include "error.h"

void DoError(char *string1,char *string2)
{
  if (string2 == NULL)
    printf(string1);
  else
    printf(string1,string2);
  exit(-1);
}

Jul 26 12:57 1991 lists.c

#include "stdio.h"
#include "boolean.h"
#include "lists.h"

List cdr(List l)
{
  if (l == NULL)
    return l;
  else
    return l->cdr;
}

void *car(List l)
{
  if (l == NULL)
    return l;
  else
    return l->car;
}

void *popIntern(List *l)
{
  List temp;
  if (*l == NULL)
    return *l;
  else {
    temp = (*l)->car;
    *l = (*l)->cdr;
    return temp;
  }
}

BOOLEAN endp(List l)
{
  return (l == NULL);
}

List cons(void *theCar,List theCdr)
{
  cell temp;
  temp = (cell)calloc(1,sizeof(cellBody));
  if (temp == NULL) {
    printf("Cons: out of memory\n");
    exit(-1);
  }
  temp->car = theCar;
  temp->cdr = theCdr;
  return temp;
}

void map(List l,mapFun f)
{
  while (l != NULL) {
    (*f)(l->car);
    l = l->cdr;
  }
}

List collect(List l,collectFun c)
{
  List temp;
  while (l != NULL) {
    (*c)(l->car);
    temp = l;
    l = l->cdr;
    free(temp);
  }
  return l;  /* every cell has been freed, so l is nil here */
}

int ListLength(List l)
{
  int count = 0;
  while (l != NULL) {
    ++count;
    l = l->cdr;
  }
  return count;
}

Aug 23 19:20 1991 pict.c

#include <stdio.h>
#include <math.h>
#include <rasterfile.h>
#include "boolean.h"
#include "error.h"
#include "pict.h"

static UCHAR bitmasks[] = { 0x80,0x40,0x20,0x10,0x8,0x4,0x2,0x1 };

ColorMap NewColorMap(int size)
{
  ColorMap cmap;
  if (size > 256)
    DoError("NewColorMap: size greater than 256.\n",NULL);
  if (size < 1)
    DoError("NewColorMap: size less than 1.\n",NULL);
  if ((cmap = (ColorMap)calloc(1,sizeof(ColorMapBody))) == NULL)
    DoError("NewColorMap: cannot allocate space.\n",NULL);
  cmap->numberOfEntries = size;
  cmap->red = (UCHAR *)calloc(size,sizeof(UCHAR));
  cmap->green = (UCHAR *)calloc(size,sizeof(UCHAR));
  cmap->blue = (UCHAR *)calloc(size,sizeof(UCHAR));
  if ((cmap->red == NULL)||(cmap->green == NULL)||(cmap->blue == NULL))
    DoError("NewColorMap: cannot allocate space.\n",NULL);
  return cmap;
}

void FreeColorMap(ColorMap cmap)
{
  if (cmap != NULL) {
    if (cmap->red != NULL)
      free(cmap->red);
    if (cmap->green != NULL)
      free(cmap->green);
    if (cmap->blue != NULL)
      free(cmap->blue);
    free(cmap);
  }
}

UCHAR ReadColorValue(ColorMap cmap, Color primary, int index)
{
  if (index > cmap->numberOfEntries)
    DoError("ReadColorValue: index too big.\n",NULL);
  if (primary == COLOR_RED)
    return *(cmap->red + index);
  if (primary == COLOR_GREEN)
    return *(cmap->green + index);
  if (primary == COLOR_BLUE)
    return *(cmap->blue + index);
  DoError("ReadColorValue: bad primary color.\n",NULL);
}

UCHAR WriteColorValue(ColorMap cmap, int index, UCHAR red, UCHAR green,
                      UCHAR blue)
{
  if (index > cmap->numberOfEntries)
    DoError("WriteColorValue: index too big.\n",NULL);
  *(cmap->red + index) = red;
  *(cmap->green + index) = green;
  *(cmap->blue + index) = blue;
}

Picture new_pict(width,height,depth)
int width,height,depth;
{
  Picture pict;
  int uchar_width;

  if ((pict = (Picture)calloc(1,sizeof(PictureBody))) == NULL)
    DoError("new_pict: cannot allocate space\n",NULL);
  pict->width = width;
  pict->height = height;
  pict->depth = depth;
  pict->cmap = NULL;
  if (pict->depth == 32)
    uchar_width = pict->width*4;
  else if (pict->depth == 8)
    uchar_width = ROUND2(pict->width);
  else if (pict->depth == 1)
    uchar_width = ROUND16(pict->width) >> 3;
  else
    DoError("new_pict: only depths of 1 and 8 are supported\n",NULL);
  pict->uchar_width = uchar_width;

  pict->data = (UCHAR *) calloc(uchar_width * pict->height,sizeof(UCHAR));
  if (pict->data == NULL)
    DoError("new_pict: cannot allocate space\n",NULL);
  return pict;
}

void free_pict(pict)
Picture pict;
{
  if (pict->data != NULL)
    free(pict->data);
  FreeColorMap(pict->cmap);
  free(pict);
}

Picture load_pict(fn)
char *fn;
{
  FILE *fp;
  Picture pict;
  int uchar_width;
  struct rasterfile header;

  if ((pict = (Picture)calloc(1,sizeof(PictureBody))) == NULL)
    DoError("load_pict: cannot allocate space",NULL);

  if ((fp = fopen(fn,"r")) == NULL)
    DoError("load_pict: error opening input file %s\n",fn);

  /* WARNING - this fread is VERY unsafe! It assumes that the C compiler
   * puts all fields of a structure adjacent. This is not always the case.
   * It appears that it works with gcc on a sparcstation, but may not work
   * on other systems. */
  fread(&header,sizeof(struct rasterfile),1,fp);
  if (header.ras_magic != RAS_MAGIC)
    DoError("load_pict: only supports rasterfile format\n",NULL);
  if ((header.ras_type != RT_STANDARD) ||
      (header.ras_maptype != RMT_NONE) ||
      (header.ras_maplength != 0))
    DoError("load_pict: unsupported rasterfile format\n",NULL);

  pict->width = header.ras_width;
  pict->height = header.ras_height;
  pict->depth = header.ras_depth;

  if (pict->depth == 32)
    uchar_width = pict->width * 4;
  else if (pict->depth == 8)
    uchar_width = ROUND2(pict->width);
  else if (pict->depth == 1)
    uchar_width = ROUND16(pict->width) >> 3;
  else
    DoError("load_pict: only depths of 1, 8, and 32 are supported\n",NULL);
  pict->uchar_width = uchar_width;

  pict->data = (UCHAR *) calloc(uchar_width * pict->height,sizeof(UCHAR));
  if (pict->data == NULL)
    DoError("load_pict: cannot allocate space\n",NULL);

  fread(pict->data, sizeof(UCHAR), uchar_width*pict->height, fp);
  fclose(fp);
  return pict;
}
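
The WARNING comment in load_pict notes that reading the header with a single fread into struct rasterfile depends on the compiler's structure layout. One way to avoid that dependence, sketched here and not part of the listing, is to read the header as raw bytes and decode each field explicitly. The sketch assumes the usual Sun rasterfile layout of eight 32-bit big-endian integers with magic number 0x59a66a95; the names HeaderSketch, be32, decode_header and read_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_RAS_MAGIC 0x59a66a95UL  /* assumed value of RAS_MAGIC */
#define SKETCH_HEADER_BYTES 32         /* eight 4-byte fields (assumed layout) */

typedef struct {
  unsigned long magic, width, height, depth;
  unsigned long length, type, maptype, maplength;
} HeaderSketch;

/* Decode one big-endian 32-bit value from four bytes. */
unsigned long be32(const unsigned char *p)
{
  return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
         ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Fill *h from a 32-byte header image. Returns 1 if the magic number
 * matches, 0 otherwise. No struct layout is assumed. */
int decode_header(const unsigned char *buf, HeaderSketch *h)
{
  assert(buf != NULL && h != NULL);
  h->magic     = be32(buf);
  h->width     = be32(buf + 4);
  h->height    = be32(buf + 8);
  h->depth     = be32(buf + 12);
  h->length    = be32(buf + 16);
  h->type      = be32(buf + 20);
  h->maptype   = be32(buf + 24);
  h->maplength = be32(buf + 28);
  return h->magic == SKETCH_RAS_MAGIC;
}

/* Read and decode a header from an open file. Returns 1 on success,
 * 0 on a short read or a bad magic number. */
int read_header(FILE *fp, HeaderSketch *h)
{
  unsigned char buf[SKETCH_HEADER_BYTES];
  assert(fp != NULL);
  if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
    return 0;
  return decode_header(buf, h);
}
```

Decoding byte by byte also makes the read independent of the host's byte order, which the single fread of the structure is not.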

Picture load_header(FILE *fp)
{
  Picture pict;
  int uchar_width;
  struct rasterfile header;

  if ((pict = (Picture)calloc(1,sizeof(PictureBody))) == NULL)
    DoError("load_header: cannot allocate space",NULL);

  /* WARNING - this fread is VERY unsafe! It assumes that the C compiler
   * puts all fields of a structure adjacent. This is not always the case.
   * It appears that it works with gcc on a sparcstation, but may not work
   * on other systems. */
  if (fread(&header,sizeof(struct rasterfile),1,fp) != 1)
    DoError("load_header: error reading header",NULL);
  if (header.ras_magic != RAS_MAGIC)
    DoError("load_pict: only supports rasterfile format\n",NULL);
  if ((header.ras_type != RT_STANDARD) ||
      (header.ras_maptype != RMT_NONE) ||
      (header.ras_maplength != 0))
    DoError("load_pict: unsupported rasterfile format\n",NULL);

  pict->width = header.ras_width;
  pict->height = header.ras_height;
  pict->depth = header.ras_depth;

  if (pict->depth == 32)
    uchar_width = pict->width * 4;
  else if (pict->depth == 8)
    uchar_width = ROUND2(pict->width);
  else if (pict->depth == 1)
    uchar_width = ROUND16(pict->width) >> 3;
  else
    DoError("load_header: only depths of 1, 8, and 32 are supported\n",NULL);
  pict->uchar_width = uchar_width;
  pict->data = NULL;

  return pict;
}

void write_pict(fn,pict)
char *fn;
Picture pict;
{
  FILE *fp;
  int uchar_width;
  struct rasterfile header;

  if ((fp = fopen(fn,"w")) == NULL)
    DoError("write_pict: error opening output file %s\n",fn);

  header.ras_magic = RAS_MAGIC;
  header.ras_width = pict->width;
  header.ras_height = pict->height;
  header.ras_depth = pict->depth;
  header.ras_length = pict->uchar_width*pict->height;
  header.ras_type = RT_STANDARD;
  if (pict->cmap == NULL) {
    header.ras_maptype = RMT_NONE;
    header.ras_maplength = 0;
    /* WARNING - this fwrite is VERY unsafe! It assumes that the C compiler
     * puts all fields of a structure adjacent. This is not always the case.
     * It appears that it works with gcc on a sparcstation, but may not work
     * on other systems. */
    if (fwrite(&header,sizeof(struct rasterfile),1,fp) != 1)
      DoError("write_pict: error writing header\n",NULL);
  }
  else {
    header.ras_maptype = RMT_EQUAL_RGB;
    header.ras_maplength = pict->cmap->numberOfEntries*3;
    /* WARNING - this fwrite is VERY unsafe! It assumes that the C compiler
     * puts all fields of a structure adjacent. This is not always the case.
     * It appears that it works with gcc on a sparcstation, but may not work
     * on other systems. */
    if (fwrite(&header,sizeof(struct rasterfile),1,fp) != 1)
      DoError("write_pict: error writing header\n",NULL);
    fwrite(pict->cmap->red,sizeof(UCHAR),pict->cmap->numberOfEntries,fp);
    fwrite(pict->cmap->green,sizeof(UCHAR),pict->cmap->numberOfEntries,fp);
    fwrite(pict->cmap->blue,sizeof(UCHAR),pict->cmap->numberOfEntries,fp);
  }

  uchar_width = pict->uchar_width;
  fwrite(pict->data, sizeof(UCHAR), uchar_width*pict->height, fp);
  fclose(fp);
}

void write_header(FILE *fp, Picture pict)
{
  struct rasterfile header;

  header.ras_magic = RAS_MAGIC;
  header.ras_width = pict->width;
  header.ras_height = pict->height;
  header.ras_depth = pict->depth;
  header.ras_length = pict->uchar_width*pict->height;
  header.ras_type = RT_STANDARD;
  header.ras_maptype = RMT_NONE;
  header.ras_maplength = 0;
  /* WARNING - this fwrite is VERY unsafe! It assumes that the C compiler
   * puts all fields of a structure adjacent. This is not always the case.
   * It appears that it works with gcc on a sparcstation, but may not work
   * on other systems. */
  fwrite(&header,sizeof(struct rasterfile),1,fp);
}

#define BytesPerScanline(pict) (pict->uchar_width)

UCHAR ReadPixel(pict,x,y)
Picture pict;
int x,y;
{
  if (pict->depth == 8)
    return *(pict->data + y*BytesPerScanline(pict) + x);
  else if (pict->depth == 1)
    return ((*(pict->data + y*BytesPerScanline(pict) + (x>>3))) &
            bitmasks[x%8])?1:0;
  else
    DoError("ReadPixel: only depths of 1 and 8 are supported\n",NULL);
}

void WritePixel(pict,x,y,color)
Picture pict;
int x,y;
UCHAR color;
{
  if (x<0||x>=pict->width||y<0||y>=pict->height) {
    char s[256];
    sprintf(s,"%d %d\n",x,y);
    DoError("WritePixel: Out of bounds: ",s);
  }
  if (pict->depth == 8)
    *(pict->data + y*pict->uchar_width + x) = color;
  else if (pict->depth == 1)
    if (color)
      *(pict->data + y*BytesPerScanline(pict) + (x>>3)) |= bitmasks[x%8];
    else
      *(pict->data + y*BytesPerScanline(pict) + (x>>3)) &= ~bitmasks[x%8];
  else
    DoError("WritePixel: only depths of 1 and 8 are supported\n",NULL);
}

void WriteClippedPixel(pict,x,y,color)
Picture pict;
int x,y;
UCHAR color;
{
  if (x<0||x>=pict->width||y<0||y>=pict->height) {
    return;
  }
  if (pict->depth == 8)
    *(pict->data + y*pict->uchar_width + x) = color;
  else if (pict->depth == 1)
    if (color)
      *(pict->data + y*BytesPerScanline(pict) + (x>>3)) |= bitmasks[x%8];
    else
      *(pict->data + y*BytesPerScanline(pict) + (x>>3)) &= ~bitmasks[x%8];
  else
    DoError("WritePixel: only depths of 1 and 8 are supported\n",NULL);
}
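
For depth-1 pictures, the pixel routines pack eight pixels per byte with the leftmost pixel in the most significant bit, and each scanline occupies ROUND16(width) >> 3 bytes. The sketch below isolates that addressing arithmetic on a plain byte buffer. ROUND16 is not defined in this section; the sketch assumes it rounds up to the next multiple of 16, and the function names are hypothetical.

```c
#include <assert.h>

/* Same mask table as in pict.c: bit 7 is the leftmost pixel of a byte. */
static const unsigned char bitmasks[8] =
  { 0x80, 0x40, 0x20, 0x10, 0x8, 0x4, 0x2, 0x1 };

/* Bytes per scanline for a 1-bit image, assuming ROUND16 rounds the
 * width up to a multiple of 16 pixels (scanlines end on 16-bit words). */
int bytes_per_scanline_1bit(int width)
{
  assert(width >= 0);
  return ((width + 15) & ~15) >> 3;
}

/* Return 1 if pixel (x,y) is set, 0 otherwise. */
int get_pixel_1bit(const unsigned char *data, int stride, int x, int y)
{
  return (data[y * stride + (x >> 3)] & bitmasks[x % 8]) ? 1 : 0;
}

/* Set pixel (x,y) when color is nonzero, clear it otherwise. */
void set_pixel_1bit(unsigned char *data, int stride, int x, int y, int color)
{
  if (color)
    data[y * stride + (x >> 3)] |= bitmasks[x % 8];
  else
    data[y * stride + (x >> 3)] &= (unsigned char)~bitmasks[x % 8];
}
```

With a width of 20 pixels, for example, the stride under this assumption is 4 bytes, so pixel (9,1) lives in byte 5 under mask 0x40.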

void CopyPicture(Picture dest, Picture src)
{
  int uchar_width;
  dest->width = src->width;
  dest->height = src->height;
  dest->depth = src->depth;
  dest->uchar_width = BytesPerScanline(src);
  uchar_width = BytesPerScanline(src);
  memcpy(dest->data,src->data,uchar_width*src->height);
}
Jul 26 13:15 1991 read.c

#include <stdio.h>
#include "misc.h"
#include "read.h"

#define MAX_STRING_LEN (255)

int ReadInt(FILE *fp)
{
  char s[MAX_STRING_LEN];
  int x;

  fgets(s,MAX_STRING_LEN,fp);
  while (sscanf(s,"%d",&x) != 1)
    fprintf(stderr,"ReadInt: integer expected - reenter.\n");
  return x;
}

int ReadFloat(FILE *fp)
{
  char s[MAX_STRING_LEN];
  float x;

  fgets(s,MAX_STRING_LEN,fp);
  while (sscanf(s,"%f",&x) != 1)
    fprintf(stderr,"ReadFloat: integer expected - reenter.\n");
  return x;
}

char *ReadString(FILE *fp)
{
  char s[MAX_STRING_LEN];
  char *endPtr;

  fgets(s,MAX_STRING_LEN,fp);
  endPtr = strchr(s,'\n');
  if (endPtr != NULL)
    *endPtr = '\0';
  return strdup(s);
}

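
ReadInt as listed calls fgets once and then loops on sscanf over the same buffer, so a line that does not parse would repeat the "reenter" message without ever reading new input. A hedged variant that reads a fresh line on each retry and stops at end of file might look like the following. The name read_int_retry and its return convention are illustrative and do not appear in the listing.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAX_STRING_LEN (255)

/* Read lines from fp until one begins with an integer.
 * Stores the value in *out and returns 1 on success;
 * returns 0 if end of file is reached first. */
int read_int_retry(FILE *fp, int *out)
{
  char s[SKETCH_MAX_STRING_LEN];

  assert(fp != NULL && out != NULL);
  while (fgets(s, SKETCH_MAX_STRING_LEN, fp) != NULL) {
    if (sscanf(s, "%d", out) == 1)
      return 1;
    /* Unlike the listing, the next iteration reads a new line. */
    fprintf(stderr, "ReadInt: integer expected - reenter.\n");
  }
  return 0;
}
```

Returning a status instead of the value lets the caller distinguish a legitimately read 0 from exhausted input.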
Aug 13 00:13 1991 Makefile

CCFLAGS = -g -c -I/net/piglet/piglet-1c/hopcroft/new/include

EXTRNS = /net/piglet/piglet-1c/hopcroft/error/error.o \
	/net/piglet/piglet-1c/hopcroft/new/pict/pict.o \
	/net/piglet/piglet-1c/hopcroft/lists/lists.o

ARGS_MODULE = /net/piglet/piglet-1c/hopcroft/new/ScanArgs/args.o

SOURCES = Makefile diff2.c dmain.c l2Norm2.c match.c matchparallel.c single.c
EXTRNSOURCES = /net/piglet/piglet-1c/hopcroft/error/error.c \
	/net/piglet/piglet-1c/hopcroft/new/pict/pict.c \
	/net/piglet/piglet-1c/hopcroft/lists/lists.c


INCLUDE = /net/piglet/piglet-1c/hopcroft/new/include/
ARGS = $(INCLUDE)args.h
BASELINES = $(INCLUDE)baselines.h
BLOBIFY = $(INCLUDE)blobify.h
BOOLEAN = $(INCLUDE)boolean.h
BOXES = $(INCLUDE)boxes.h
CONTOUR = $(INCLUDE)newContour.h
DESCRIPTORS = $(INCLUDE)descriptors.h
DICT = $(INCLUDE)dict.h
DIFF = $(INCLUDE)diff.h
DIFF2 = $(INCLUDE)diff2.h
ERROR = $(INCLUDE)error.h
LINES = $(INCLUDE)lines.h
LISTS = $(INCLUDE)lists.h
MATCH = $(INCLUDE)match.h
MATCHPARALLEL = $(INCLUDE)matchparallel.h
MISC = $(INCLUDE)misc.h
MYLIB = $(INCLUDE)mylib.h
NEWMATCH = $(INCLUDE)newMatch.h
ORIENT = $(INCLUDE)orient.h
PICT = $(INCLUDE)pict.h
READ = $(INCLUDE)read.h
TYPES = $(INCLUDE)types.h

INCSOURCES = $(BASELINES) $(BLOBIFY) $(BOOLEAN) $(BOXES) $(CONTOUR) \
	$(DICT) $(DIFF) $(DIFF2) $(LINES) $(LISTS) $(MATCH) $(MATCHPARALLEL) \
	$(ORIENT) $(PICT) $(TYPES)

anomalies: anomalies.o diff2.o newMatch.o ../main/dict.o
	gcc anomalies.o diff2.o newMatch.o ../main/dict.o $(EXTRNS) -lm -o $@

descriptors: descMain.o descriptors.o diff2.o newMatch.o newL2.o ../main/dict.o
	gcc descMain.o descriptors.o diff2.o newMatch.o newL2.o ../main/dict.o ../lib/mylib.a -lm -o $@

drawBlobs: drawBlobs.o ../main/dict.o
	gcc drawBlobs.o ../main/dict.o ../lib/mylib.a -lm -o $@

compare: diff2.o dmain.o newMatch.o ../main/dict.o
	gcc dmain.o diff2.o newMatch.o ../main/dict.o \
	$(EXTRNS) -lm -o $@

equiv: equiv.o descriptors.o diff2.o newMatch.o newL2.o ../main/dict.o
	gcc equiv.o descriptors.o diff2.o newMatch.o newL2.o ../main/dict.o ../lib/mylib.a -lm -o $@

extract: extract.o ../main/dict.o
	gcc extract.o ../main/dict.o $(EXTRNS) -o $@

l2Norm: l2Norm2.o ../main/dict.o
	gcc l2Norm2.o ../main/dict.o $(EXTRNS) -lm -o $@

recogDesc: recogDesc.o ../main/dict.o diff2.o newMatch.o newL2.o
	gcc recogDesc.o ../main/dict.o diff2.o newMatch.o newL2.o ../lib/mylib.a -lm -o $@

resample: resample.o ../main/dict.o
	gcc resample.o ../main/dict.o $(EXTRNS) -lm -o $@

single: single.o newMatch.o diff2.o newL2.o ../main/dict.o
	gcc single.o newMatch.o diff2.o newL2.o ../main/dict.o ../lib/mylib.a -lm -o $@

sortMatrix: sortMatrix.o
	gcc sortMatrix.o $(EXTRNS) -o $@

printAll: printIncludes printExtrns printCode

printCode: $(SOURCES)
	/usr/5bin/pr -n3 $(SOURCES) | lpr -PWeeklyWorldNews

printExtrns: $(EXTRNSOURCES)
	/usr/5bin/pr -n3 $(EXTRNSOURCES) | lpr -PWeeklyWorldNews

printIncludes: $(INCSOURCES)
	/usr/5bin/pr -n3 $(INCSOURCES) | lpr -PWeeklyWorldNews

anomalies.o: anomalies.c $(ERROR) $(TYPES) $(PICT) $(DICT) $(DIFF) $(MISC)
	gcc $(CCFLAGS) anomalies.c

descriptors.o: descriptors.c $(MYLIB) $(TYPES) $(DICT) $(DIFF) $(MISC) $(DESCRIPTORS)
	gcc $(CCFLAGS) descriptors.c

descMain.o: descMain.c $(MYLIB) $(TYPES) $(DICT) $(DIFF) $(DESCRIPTORS)
	gcc $(CCFLAGS) descMain.c

diff2.o: diff2.c $(BOOLEAN) $(TYPES) $(PICT) $(DIFF2) $(NEWMATCH)
	gcc $(CCFLAGS) diff2.c

dmain.o: dmain.c $(BOOLEAN) $(PICT) $(DIFF)
	gcc $(CCFLAGS) dmain.c

drawBlobs.o: drawBlobs.c $(MYLIB) $(TYPES) $(DICT)
	gcc $(CCFLAGS) drawBlobs.c

equiv.o: equiv.c $(MYLIB) $(TYPES) $(DICT) $(DIFF) $(DESCRIPTORS)
	gcc $(CCFLAGS) equiv.c

extract.o: extract.c $(BOOLEAN) $(TYPES) $(DICT)
	gcc $(CCFLAGS) extract.c

l2Norm2.o: l2Norm2.c $(BOOLEAN) $(TYPES) $(ERROR) $(DICT)
	gcc $(CCFLAGS) l2Norm2.c

match.o: match.c $(BOOLEAN) $(TYPES) $(PICT) $(DIFF2) $(MATCH) $(MATCHPARALLEL)
	gcc $(CCFLAGS) match.c

matchparallel.o: matchparallel.c $(BOOLEAN) $(TYPES) $(PICT) $(DIFF2) \
	$(MATCH) $(MATCHPARALLEL)
	gcc $(CCFLAGS) matchparallel.c

newL2.o: newL2.c $(BOOLEAN) $(ERROR) $(TYPES) $(DICT)
	gcc $(CCFLAGS) newL2.c

newMatch.o: newMatch.c $(ERROR) $(MISC) $(NEWMATCH) $(DICT) $(TYPES)
	gcc $(CCFLAGS) newMatch.c

recogDesc.o: recogDesc.c $(MYLIB) $(TYPES) $(DICT) $(DIFF)
	gcc $(CCFLAGS) recogDesc.c

resample.o: resample.c $(BOOLEAN) $(TYPES) $(ERROR) $(DICT)
	gcc $(CCFLAGS) resample.c

single.o: single.c $(MYLIB) $(TYPES) $(DICT) $(DIFF) $(DIFF2) $(MATCH) $(MATCHPARALLEL)
	gcc $(CCFLAGS) single.c

sortMatrix.o: sortMatrix.c $(ERROR) $(PICT)
	gcc $(CCFLAGS) sortMatrix.c



10/24/2003, EAST version: 1.4.1 



5,491,760 
Section C 



128 

APPENDIX / Page 45 



Jul 9 19:36 1991 anomalies.c

#include <stdio.h>
#include "error.h"
#include "types.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"
#include "misc.h"

#define MAX_STRING_LEN (100)
#define MAX_DICTIONARIES (15)
#define MAX_WORDS (100)
#define MAX_ENTRIES (MAX_WORDS*MAX_WORDS)

typedef struct {
  float score;
  int x;
  int y;
} *CompareTuple,CompareTupleBody;


int ReadInt(FILE *fp)
{
  char s[MAX_STRING_LEN];
  int x;

  fgets(s,MAX_STRING_LEN,fp);
  while (sscanf(s,"%d",&x) != 1)
    fprintf(stderr,"ReadInt: integer expected - reenter.\n");
  return x;
}

char *ReadString(FILE *fp)
{
  char s[MAX_STRING_LEN];
  char *endPtr;

  fgets(s,MAX_STRING_LEN,fp);
  endPtr = strchr(s,'\n');
  if (endPtr != NULL)
    *endPtr = '\0';
  return strdup(s);
}

int TupleLessThan(CompareTuple *x,CompareTuple *y)
{
  if ((*x)->score == (*y)->score)
    return 0;
  else if ((*x)->score < (*y)->score)
    return -1;
  else
    return 1;
}

int CountAnomalies(Dictionary d1,Dictionary d2,char *dName1,char *dName2,char **words,FILE *outfp)
{
  CompareTupleBody scoreBodies[MAX_ENTRIES];
  CompareTuple scores[MAX_ENTRIES];
  Picture pict;
  int x,y,i,j;
  int anomalies;

  pict = CompareDictionaries(d1,d2);

  for (y=0,i=0;y<pict->height;++y)
    for (x=0;x<pict->width;++x) {
      CompareTuple temp;
      /* temp = (CompareTuple)calloc(1,sizeof(CompareTupleBody));
      if (temp == NULL)
        DoError("%s: cannot allocate space.\n",argv[0]);
      */
      temp = scoreBodies+i;
      temp->score = *((float *)(pict->data)+x+y*pict->width);
      temp->x = x;
      temp->y = y;
      scores[i] = temp;
      ++i;
    }
  qsort(scores,i,sizeof(CompareTuple),TupleLessThan);
  for (j=0,anomalies=0;j<d1->numberOfEntries;++j)
    if (scores[j]->x != scores[j]->y) {
      fprintf(outfp,"%s: %s %s: %s\n",dName1,words[scores[j]->x],
              dName2,words[scores[j]->y]);
      ++anomalies;
    }

  free_pict(pict);
  return anomalies;
}

void main(int argc,char **argv)
{
  char *outFile,*listFile;
  int numberOfDictionaries;
  Dictionary dictionaries[MAX_DICTIONARIES];
  char *names[MAX_DICTIONARIES];
  char *words[MAX_WORDS];
  int numberOfWords;
  FILE *listfp,*outfp;
  int anomalies[MAX_DICTIONARIES][MAX_DICTIONARIES];
  int i,x,y;

  if (argc != 3)
    DoError("Usage: %s listfile outfile.\n",argv[0]);
  listFile = argv[1];
  outFile = argv[2];







  if ((listfp = fopen(listFile,"r")) == NULL)
    DoError("Error opening file %s.\n",listFile);

  /* Read in the number of words in each dictionary */
  numberOfWords = ReadInt(listfp);
  if (numberOfWords > MAX_WORDS)
    DoError("%s: too many words.\n",argv[0]);

  /* Read in the words */
  for (i=0;i<numberOfWords;++i) {
    words[i] = ReadString(listfp);
  }

  /* Read in the number of dictionaries */
  numberOfDictionaries = ReadInt(listfp);
  if (numberOfDictionaries > MAX_DICTIONARIES)
    DoError("%s: too many dictionaries.\n",argv[0]);

  /* Read in the dictionaries and their names */
  for (i=0;i<numberOfDictionaries;++i) {
    names[i] = ReadString(listfp);
    dictionaries[i] = ReadDictionary(names[i]);
  }

  /* Check to see that all dictionaries have the same number of shapes as the specified number
     of words. */
  for (i=1;i<numberOfDictionaries;++i)
    if (dictionaries[i]->numberOfEntries != numberOfWords)
      DoError("Dictionary %s has wrong number of entries.\n",names[i]);

  /* Write the results */
  if ((outfp = fopen(outFile,"w")) == NULL)
    DoError("Error opening %s for output.\n",outFile);
  fprintf(outfp,"Words:\n");
  for (i=0;i<numberOfWords;++i)
    fprintf(outfp,"%d: %s\n",i,words[i]);
  fprintf(outfp,"\n");
  fprintf(outfp,"Dictionaries:\n");
  for (i=0;i<numberOfDictionaries;++i)
    fprintf(outfp,"%d: %s\n",i,names[i]);
  fprintf(outfp,"\n");

  /* Fill in the anomaly counts */
  for (y=0;y<numberOfDictionaries;++y)
    for (x=0;x<numberOfDictionaries;++x) {
      anomalies[y][x] =
        CountAnomalies(dictionaries[y],dictionaries[x],names[y],names[x],words,outfp);
      printf("(%d,%d): %d\n",x,y,anomalies[y][x]);
    }

  fprintf(outfp,"\n\n");
  fprintf(outfp,"   ");
  for (x=0;x<numberOfDictionaries;x++)
    fprintf(outfp,"%7d ",x);
  fprintf(outfp,"\n");






  for (y=0;y<numberOfDictionaries;++y) {
    fprintf(outfp,"%3d",y);
    for (x=0;x<numberOfDictionaries;++x)
      fprintf(outfp,"%7d ",anomalies[y][x]);
    fprintf(outfp,"\n");
  }
  fclose(outfp);

}
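
CountAnomalies ranks every cell of the dictionary-comparison matrix by score and then inspects the best-scoring cells, one per dictionary entry. A cell among those whose row and column differ means a word shape matched a different word better than expected, and it is counted as an anomaly. The sketch below reproduces that counting on a plain n-by-n float matrix. It is an illustration only: the row-major matrix argument stands in for the Picture returned by CompareDictionaries, and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  float score;
  int x;
  int y;
} TupleSketch;

/* qsort comparator over an array of TupleSketch pointers, ascending score. */
static int tuple_less_than(const void *a, const void *b)
{
  const TupleSketch *ta = *(const TupleSketch *const *)a;
  const TupleSketch *tb = *(const TupleSketch *const *)b;
  if (ta->score == tb->score)
    return 0;
  return (ta->score < tb->score) ? -1 : 1;
}

/* Count off-diagonal cells among the n best (lowest) scores of an
 * n-by-n row-major score matrix. Returns -1 on allocation failure. */
int count_anomalies_sketch(const float *matrix, int n)
{
  TupleSketch *bodies;
  TupleSketch **scores;
  int x, y, i, anomalies = 0;
  int total = n * n;

  assert(matrix != NULL && n > 0);
  bodies = malloc((size_t)total * sizeof *bodies);
  scores = malloc((size_t)total * sizeof *scores);
  if (bodies == NULL || scores == NULL) {
    free(bodies);
    free(scores);
    return -1;
  }
  for (y = 0, i = 0; y < n; ++y)
    for (x = 0; x < n; ++x) {
      bodies[i].score = matrix[x + y * n];
      bodies[i].x = x;
      bodies[i].y = y;
      scores[i] = &bodies[i];
      ++i;
    }
  /* Sort the pointers, not the bodies, as the listing does. */
  qsort(scores, (size_t)total, sizeof *scores, tuple_less_than);
  for (i = 0; i < n; ++i)
    if (scores[i]->x != scores[i]->y)
      ++anomalies;
  free(scores);
  free(bodies);
  return anomalies;
}
```

Sorting an array of pointers keeps each tuple's (x, y) coordinates attached to its score without moving the tuple bodies.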
Jul 31 17:14 1991 descMain.c

#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"
#include "diff.h"
#include "descriptors.h"

void PrintDescriptors(Dictionary models,char *modelName,char **wordNames,
                      int numberOfFonts,Dictionary fonts[],
                      char **fontNames,int numberOfWords,
                      DiffDescriptor dd)
{
  int modelIndex,fontIndex;
  int starCount,correctCount;
  Descriptor thisDescriptor;
  int lineCount;

  printf("\f\n");
  PrintWords(wordNames,numberOfWords);
  lineCount = 0;
  starCount = 0;
  correctCount = 0;
  for (modelIndex=0;modelIndex<numberOfWords;++modelIndex) {
    printf("%s%s\n",modelName,wordNames[modelIndex]);
    ++lineCount;
    for (fontIndex=0;fontIndex<numberOfFonts;++fontIndex) {
      thisDescriptor =
        ComputeDescriptor(modelIndex,models,fonts[fontIndex],numberOfWords,dd);
      printf(" ");
      PrintField(fontNames[fontIndex],20);
      PrintDescriptor(thisDescriptor,&starCount,&correctCount);
      printf("\n");
      ++lineCount;
    }
    if (lineCount>30) {
      printf("\f\n");
      PrintWords(wordNames,numberOfWords);
      lineCount = 0;
    }
  }
  fprintf(stdout,"There were %d mismatches ",starCount-numberOfWords*numberOfFonts);
  fprintf(stdout,"better than %d correct matches. (%6.2f%)\n",
          numberOfWords*numberOfFonts,
          (float)(numberOfWords*numberOfFonts)/(float)starCount);
  fprintf(stdout,"There were %d correctly matched words out of %d. (%6.2f%)\n",
          correctCount,numberOfWords*numberOfFonts,
          (float)correctCount/(float)numberOfWords/numberOfFonts);
}

void main(int argc,char **argv)
{
  char *listFile;
  Dictionary models;
  char *modelName;
  int numberOfFonts;
  Dictionary fonts[MAX_FONTS];
  char *fontNames[MAX_FONTS];
  char *wordNames[MAX_WORDS];
  int numberOfWords;
  float centerWeight;
  int normalBandWidth;
  BOOLEAN lengthNormalize,useL2,slopeConstrain,warp,topToBottomOption,hillToValleyOption;
  BOOLEAN separate;
  float topToBottom,hillToValleyLocal;
  FILE *listfp;
  int i,x,y;
  DiffDescriptorBody dd;

  centerWeight = 1.0;
  normalBandWidth = 20;
  topToBottom = 1.0;
  hillToValleyLocal = 1.0;
  DefArg("%s","listFile",&listFile);
  DefOption("-L2","-L2",&useL2);
  DefOption("-slopeConstrain %f","-slopeConstrain <center weight>",
            &slopeConstrain,&centerWeight);
  DefOption("-warp %f %d","-warp <center weight> <band width>",
            &warp,&centerWeight,&normalBandWidth);
  DefOption("-separate","-separate",&separate);
  DefOption("-normalize","-normalize",&lengthNormalize);
  DefOption("-topToBottom %f","-topToBottom <ratio>",&topToBottomOption,&topToBottom);
  DefOption("-hillToValley %f","-hillToValley <ratio>",&hillToValleyOption,&hillToValleyLocal);
  ScanArgs(argc,argv);

  if ((listfp = fopen(listFile,"r")) == NULL)
    DoError("Error opening file %s.\n",listFile);

  /* Read in the number of words in each dictionary */
  numberOfWords = ReadInt(listfp);
  if (numberOfWords > MAX_WORDS)
    DoError("%s: too many words.\n",argv[0]);

  /* Read in the words */
  for (i=0;i<numberOfWords;++i) {
    wordNames[i] = ReadString(listfp);
  }

  /* Read in the model dictionary */
  modelName = ReadString(listfp);
  models = ReadDictionary(modelName);

  /* Read in the number of dictionaries */
  numberOfFonts = ReadInt(listfp);
  if (numberOfFonts > MAX_FONTS)
    DoError("%s: too many dictionaries.\n",argv[0]);

  /* Read in the dictionaries and their names */
  for (i=0;i<numberOfFonts;++i) {
    fontNames[i] = ReadString(listfp);
    fonts[i] = ReadDictionary(fontNames[i]);
  }

  /* Check to see that all dictionaries have the same number of shapes as the specified number
     of words. */
  for (i=1;i<numberOfFonts;++i)
    if (fonts[i]->numberOfEntries < numberOfWords)
      DoError("Dictionary %s has too few entries.\n",fontNames[i]);
  if (models->numberOfEntries < numberOfWords)
    DoError("Model dictionary has too few entries.\n",NULL);



  if (useL2) {
    fprintf(stdout,"Using L2 on length normalized shapes.\n");
    dd.diffType = L2;
  }
  else if (slopeConstrain) {
    fprintf(stdout,"Using dynamic time warping with slope constrained to [0.5,2].\n");
    dd.diffType = CONSTRAINED;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  else {
    fprintf(stdout,"Using dynamic time warping with bandwidth %d.\n",normalBandWidth);
    dd.diffType = WARP;
    dd.bandWidth = normalBandWidth;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  if (!useL2) {
    fprintf(stdout,"Center weight = %f.\n",centerWeight);
    dd.centerWeight = centerWeight;
    if (lengthNormalize) {
      dd.lengthNormalize = TRUE;
      fprintf(stdout,"Scores normalized by signal length.\n");
    }
    else
      dd.lengthNormalize = FALSE;
  }
  dd.hillToValley = hillToValleyLocal;
  dd.topToBottom = topToBottom;
  dd.pathFP = NULL;

  fprintf(stdout,"Words:\n");
  for (i=0;i<numberOfWords;++i)
    fprintf(stdout,"%d: %s\n",i,wordNames[i]);
  fprintf(stdout,"\n");
  fprintf(stdout,"Model font is %s.\n",modelName);
  fprintf(stdout,"Fonts:\n");
  for (i=0;i<numberOfFonts;++i)
    fprintf(stdout,"%d: %s\n",i,fontNames[i]);
  fprintf(stdout,"\n");

  PrintDescriptors(models,modelName,wordNames,numberOfFonts,fonts,fontNames,numberOfWords,&dd);
}
#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"
#include "diff.h"
#include "misc.h"
#include "descriptors.h"

typedef struct {
  float score;
  int word;
} *CompareTuple,CompareTupleBody;

int TupleLessThan(CompareTuple x,CompareTuple y)
{
  if (x->score == y->score)
    return 0;
  else if (x->score < y->score)
    return -1;
  else
    return 1;
}

int CompareDescriptorElements(Descriptor x,Descriptor y)
{
  if (*x == *y)
    return 0;
  else if (*x < *y)
    return -1;
  else
    return 1;
}

Descriptor ComputeDescriptor(int modelIndex,Dictionary models,Dictionary thisFont,int numberOfWords,
                             DiffDescriptor dd)
{
  DescriptorElement descriptor[MAX_WORDS+1];
  CompareTupleBody results[MAX_WORDS];
  int i;

  for (i=0;i<numberOfWords;++i) {
    results[i].score =
      DiffPair(*(models->outlines+modelIndex),*(thisFont->outlines+i),dd);
    results[i].word = i;
  }
  qsort(results,thisFont->numberOfEntries,sizeof(CompareTupleBody),TupleLessThan);
  for (i=0;i<numberOfWords;++i) {
    descriptor[i] = results[i].word+1; /* Descriptor values are one greater than word indices */
    if (results[i].word == modelIndex) {
      ++i;
      break;
    }
  }
  descriptor[i] = '\0';
  qsort(descriptor,i,sizeof(DescriptorElement),CompareDescriptorElements);
  return (Descriptor)strdup((char *)descriptor);
}

void PrintField(char *s,int w)
{
  int i,l;
  printf("%s",s);
  l = w-strlen(s);
  for (i=0;i<l;++i)
    printf(" ");
}

void PrintDescriptor(Descriptor d,int *starCount,int *correctCount)
{
  int i = 1; /* Descriptor values are one greater than word indices */
  int temp;
  temp = *starCount;
  if (*d == '\0') {
    printf("*");
    ++*starCount;
  }
  while (*d != '\0') {
    while (i++ < *d)
      printf(" ");
    printf("*");
    ++*starCount;
    d++;
  }
  if (*starCount-temp == 1)
    ++*correctCount;
}

void PrintWords(char **words,int numberOfWords)
{
  int lengths[MAX_WORDS];
  int i,j;
  int maxLength = 0;

  maxLength = 0;
  for (i=0;i<numberOfWords;++i) {
    lengths[i] = strlen(words[i]);
    if (lengths[i] > maxLength)
      maxLength = lengths[i];
  }

  for (j=0;j<maxLength;++j) {
    printf(" ");
    for (i=0;i<numberOfWords;++i)
      if (j<lengths[i])
        printf("%c",*(words[i]+j));
      else
        printf(" ");
    printf("\n");
  }
}

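
ComputeDescriptor ranks all words of a font by their difference score against one model word, keeps the words ranked at or above the correct word, stores each kept index plus one, and sorts the result so it can be handled as a NUL-terminated string. A descriptor with a single element therefore marks a correct best match. The following sketch applies the same steps to an array of precomputed scores. The score array stands in for the DiffPair calls, DescriptorElement is assumed to be an unsigned char, and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  float score;
  int word;
} RankSketch;

/* Ascending order by score. */
static int rank_cmp(const void *a, const void *b)
{
  const RankSketch *ra = a;
  const RankSketch *rb = b;
  if (ra->score == rb->score)
    return 0;
  return (ra->score < rb->score) ? -1 : 1;
}

/* Ascending order by element value. */
static int element_cmp(const void *a, const void *b)
{
  unsigned char ea = *(const unsigned char *)a;
  unsigned char eb = *(const unsigned char *)b;
  return (ea > eb) - (ea < eb);
}

/* scores[i] is the difference between the model word and word i of a font.
 * Writes the sorted, 0-terminated descriptor into out (room for n+1
 * elements) and returns its length, or -1 on allocation failure. */
int compute_descriptor_sketch(const float *scores, int n, int modelIndex,
                              unsigned char *out)
{
  RankSketch *results;
  int i, count = 0;

  assert(scores != NULL && out != NULL);
  assert(n > 0 && n < 255 && modelIndex >= 0 && modelIndex < n);
  results = malloc((size_t)n * sizeof *results);
  if (results == NULL)
    return -1;
  for (i = 0; i < n; ++i) {
    results[i].score = scores[i];
    results[i].word = i;
  }
  qsort(results, (size_t)n, sizeof *results, rank_cmp);
  for (i = 0; i < n; ++i) {
    /* Values are one greater than word indices so that 0 can terminate. */
    out[count++] = (unsigned char)(results[i].word + 1);
    if (results[i].word == modelIndex)
      break;
  }
  out[count] = 0;
  qsort(out, (size_t)count, sizeof out[0], element_cmp);
  free(results);
  return count;
}
```

Storing index plus one is what lets the listing duplicate the descriptor with strdup, since a stored zero would otherwise end the string early.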
#include <stdio.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"
#include "newMatch.h"


extern double fabs(double);

/* Dynamic programming version of DiffPair */
inline float DiffPair(OutlinePair one,OutlinePair two,
                      DiffDescriptor dd)
{
  hillToValley = dd->hillToValley;
  if ((dd->separate)&&(dd->pathFP != NULL))
    DoError("DiffPair: separate cannot be used with pathfile option.\n",NULL);
  switch (dd->diffType) {
  case CONSTRAINED:
    if (dd->pathFP != NULL)
      return SlopeCMatchAndPath(one->top,one->bottom,one->numberOfLegs,
                                two->top,two->bottom,two->numberOfLegs,
                                dd->centerWeight,dd->lengthNormalize,dd->topToBottom,
                                dd->pathFP);
    else
      if (dd->separate)
        return
          SepSlopeCMatch(one->top,one->numberOfLegs,two->top,two->numberOfLegs,
                         dd->centerWeight,dd->lengthNormalize)*dd->topToBottom +
          SepSlopeCMatch(one->bottom,one->numberOfLegs,two->bottom,two->numberOfLegs,
                         dd->centerWeight,dd->lengthNormalize);
      else
        return SlopeCMatch(one->top,one->bottom,one->numberOfLegs,
                           two->top,two->bottom,two->numberOfLegs,
                           dd->centerWeight,dd->lengthNormalize,dd->topToBottom);
    break;
  case L2:
    if (dd->pathFP != NULL)
      DoError("DiffPair: L2 does not support path computation.\n",NULL);
    else
      return L2Compare(one,two,dd->topToBottom);
    break;
  case WARP:
    if (dd->pathFP != NULL)
      return NewMatchAndPath(one->top,one->bottom,one->numberOfLegs,
                             two->top,two->bottom,two->numberOfLegs,
                             dd->centerWeight,dd->lengthNormalize,dd->bandWidth,
                             dd->topToBottom,
                             dd->pathFP);
    else
      if (dd->separate)
        return SepMatch(one->top,one->numberOfLegs,two->top,two->numberOfLegs,
                        dd->centerWeight,dd->lengthNormalize,dd->bandWidth)*dd->topToBottom +
          SepMatch(one->bottom,one->numberOfLegs,two->bottom,two->numberOfLegs,
                   dd->centerWeight,dd->lengthNormalize,dd->bandWidth);
      else
        return
          NewMatch(one->top,one->bottom,one->numberOfLegs,two->top,two->bottom,two->numberOfLegs,
                   dd->centerWeight,dd->lengthNormalize,dd->bandWidth,
                   dd->topToBottom);
    break;
  default:
    DoError("DiffPair: internal error.\n",NULL);
  }
}
66 

#ifdef foo
inline float DiffPairAndPath(OutlinePair one, OutlinePair two,
                             float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
                             char *filename,BOOLEAN doPath)
{
  FILE *fp;
  float score;
  if ((fp = fopen(filename,"w")) == NULL)
    DoError("DiffPairAndMatch: error opening output file %s.\n",filename);
  score = NewMatchAndPath(one->top,one->bottom,one->numberOfLegs,
                          two->top,two->bottom,two->numberOfLegs,
                          centerWeight,lengthNormalize,normalBandWidth,
                          fp,doPath);
  fclose(fp);
  return score;
}
#endif

BOOLEAN IsSymmetric(Picture pict)
{
  int x,y;
  float maxDiff = 0;
  for (y=0;y<pict->height;++y)
    for (x=0;x<pict->width;++x) {
      float temp = fabs(*((float *)(pict->data)+pict->width*y+x) -
                        *((float *)(pict->data)+pict->width*x+y));
      if (temp > maxDiff)
        maxDiff = temp;
    }
  fprintf(stderr,"maxDiff = %f.\n",maxDiff);
  if (maxDiff > 0.01)
    return FALSE;
  return TRUE;
}
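
The symmetry test compares entry (y,x) of a row-major float matrix with entry (x,y) and tracks the largest absolute difference. The following is a minimal standalone sketch of that check, assuming a square n-by-n matrix held in a plain float array; the names max_asymmetry and is_symmetric and the flat-array layout are illustrative assumptions and do not appear in the listing.

```c
/* Largest |m[y][x] - m[x][y]| over a square n-by-n row-major matrix.
 * Hypothetical helper: the listing operates on a Picture instead. */
static float max_asymmetry(const float *m, int n)
{
  float maxDiff = 0.0f;
  int x, y;
  for (y = 0; y < n; ++y)
    for (x = y + 1; x < n; ++x) {      /* the upper triangle is sufficient */
      float d = m[n * y + x] - m[n * x + y];
      if (d < 0.0f)
        d = -d;
      if (d > maxDiff)
        maxDiff = d;
    }
  return maxDiff;
}

/* Uses the same 0.01 tolerance as the listing. */
static int is_symmetric(const float *m, int n)
{
  return max_asymmetry(m, n) <= 0.01f;
}
```

Indexing the transposed entry as width*x + y is only meaningful when width equals height, that is, when both dictionaries hold the same number of entries.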

/* Given the names of two dictionary files, compute the squared difference
 * between every pair of shapes in the cross product of the dictionaries.
 * The result is a matrix printed to stdout. The width and height are
 * followed by the matrix entries in row major order. The output is in
 * ascii to facilitate reading by a Symbolics. */
Picture CompareDictionaries(Dictionary dict1, Dictionary dict2,DiffDescriptor dd)
{
  Picture pict;
  int x,y;
  pict = new_pict(dict2->numberOfEntries,
                  dict1->numberOfEntries,
                  32);

  for (y=0;y<pict->height;++y)
    for (x=0;x<pict->width;++x) {
      /* for output files when printing and match */
      printf("--> (%d,%d) <--\n",y,x);
      *((float *)(pict->data)+pict->width*y+x) =
        DiffPair(*(dict1->outlines+y),
                 *(dict2->outlines+x),
                 dd);
    }
  if (!IsSymmetric(pict))
    fprintf(stderr,"Matrix is not symmetric.\n");
  return pict;
}

void WritePictureAsAscii(Picture pict,char *filename,
                         char *info1, char *info2)
{
  FILE *fp;
  int x,y;
  int count;

  if ((fp = fopen(filename,"w")) == NULL)
    DoError("WritePictureAsAscii: error opening output file\n",NULL);
  fprintf(fp,"%s\n",info1);
  fprintf(fp,"%s\n",info2);
  fprintf(fp,"#\n");
  fprintf(fp,"%d\n%d\n",pict->width,pict->height);
  fprintf(fp,"%3s","");
  for (x=0;x<pict->width;x++)
    fprintf(fp,"%7d ",x);
  fprintf(fp,"\n");
  for (y=0;y<pict->height;++y) {
    fprintf(fp,"%3d ",y);
    count = 1;
    for (x=0;x<pict->width;++x) {
      /* Index the entry directly so that pict->data is left unchanged. */
      fprintf(fp,"%7.3f ",*((float *)(pict->data)+pict->width*y+x));
      /* if ((pict->width > 10)&&(!((count++)%10)))
         fprintf(fp,"\n");
      */ }
    fprintf(fp,"\n");
  }
  fclose(fp);
}




Jul 22 15:21 1991 dmain.c 



#include <stdio.h>
#include <math.h>
#include <values.h>
#include "boolean.h"
#include "types.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"
#include "match.h"
#include "matchparallel.h"

void main(int argc,char **argv)
{
  Picture pict;
  char *infile1,*infile2,*outfile,*format;
  Dictionary dict1,dict2;

  if (argc != 5) {
    printf("Usage:\n");
    printf(" %s infile1 infile2 outfile format\n",argv[0]);
    printf(" where format is either ascii or pict\n");
    exit(-1);
  }

  infile1 = argv[1];
  infile2 = argv[2];
  outfile = argv[3];
  format = argv[4];

  dict1 = ReadDictionary(infile1);
  dict2 = ReadDictionary(infile2);
  pict = CompareDictionaries(dict1,dict2,TRUE,20,FALSE);
  if (!strcmp(format,"pict"))
    write_pict(outfile,pict);
  else
    WritePictureAsAscii(pict,outfile,dict1->infoString,dict2->infoString);
}
#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"

#define WIDTH (800)
#define H_MARGIN (20)
#define V_MARGIN (60)
#define H_SPACING (20)
#define V_SPACING (150) /* Must be greater than 2*X_HEIGHT */
#define X_HEIGHT (17)

extern int irint(double);

void DrawVLine(Picture pict,int x,int yt,int yb)
{
  int i;
  for (i=yt;i<yb;++i)
    WritePixel(pict,x,i,1);
}

void DrawOutline(Picture pict,OutlinePair o,int x,int y)
{
  int i,top,bottom;
  for (i=0;i<o->numberOfLegs;++i) {
    top = irint(-*(o->top+i)*X_HEIGHT);
    bottom = irint(*(o->bottom+i)*X_HEIGHT+X_HEIGHT);
    DrawVLine(pict,i+x,top+y,bottom+y);
  }
}

int main(int argc,char **argv)
{
  char *infile,*outfile;
  Dictionary dict;
  Picture pict;
  int i,totalLegs,totalLines;
  int x,y,newX;

  DefArg("%s %s","infile outfile",&infile,&outfile);
  ScanArgs(argc,argv);

  dict = ReadDictionary(infile);

  /* First pass: lay the shapes out in rows to find the picture height. */
  for (i=0,totalLegs=H_MARGIN,totalLines=V_MARGIN;i<dict->numberOfEntries;++i) {
    OutlinePair thisOutline = *(dict->outlines+i);
    totalLegs += thisOutline->numberOfLegs+H_SPACING;
    if (totalLegs > WIDTH) {
      totalLines += V_SPACING;
      totalLegs = H_MARGIN+thisOutline->numberOfLegs+H_SPACING;
      if (totalLegs > WIDTH)
        DoError("%s: Shape is too wide.\n",argv[0]);
    }
  }

  pict = new_pict(WIDTH,totalLines+V_MARGIN*2,1);

  /* Second pass: draw each shape at its position. */
  for (i=0,x=H_MARGIN,y=V_MARGIN;i<dict->numberOfEntries;++i) {
    OutlinePair thisOutline = *(dict->outlines+i);

    newX = x+thisOutline->numberOfLegs+H_SPACING;
    if (newX > WIDTH) {
      newX = H_MARGIN+thisOutline->numberOfLegs+H_SPACING;
      x = H_MARGIN;
      y += V_SPACING;
    }

    DrawOutline(pict,*(dict->outlines+i),x,y);

    x = newX;
  }

  write_pict(outfile,pict);
}
Jul 26 16:47 1991 equiv.c

#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"
#include "diff.h"
#include "descriptors.h"

void PrintEquivalenceClasses(int numberOfWords,char **wordNames,
                             int numberOfFonts,Dictionary fonts[],char **fontNames,
                             DiffDescriptor dd)
{
  Descriptor descriptors[MAX_FONTS*MAX_FONTS];
  int matchesWord[MAX_WORDS];
  int word,font1,font2,i;
  int totalDifferent,totalWords;
  int numberOfPairs;

  for (word=0;word<numberOfWords;++word) {
    for (font1=0,numberOfPairs=0;font1<numberOfFonts;++font1)
      for (font2=0;font2<font1;++font2)
        descriptors[numberOfPairs++] = ComputeDescriptor(word,fonts[font1],
                                                         fonts[font2],numberOfWords,dd);
    for (i=0;i<numberOfWords;++i)
      matchesWord[i] = 0;
    for (i=0;i<numberOfPairs;++i) {
      Descriptor p;
      p = descriptors[i];
      while (*p != '\0')
        matchesWord[*p++ - 1]++;
    }
    for (i=0,totalDifferent=0,totalWords=0;i<numberOfWords;++i) {
      if (matchesWord[i])
        ++totalDifferent;
      totalWords += matchesWord[i];
    }
    printf("%20s:\t\t%6d %6.2f %6d %6.2f\n",wordNames[word],totalWords,
           (float)totalWords/numberOfPairs,totalDifferent,
           (float)totalDifferent/(float)totalWords*numberOfPairs);
    fprintf(stderr,"%20s:\t\t%6d %6.2f %6d %6.2f\n",wordNames[word],totalWords,
            (float)totalWords/numberOfPairs,totalDifferent,
            (float)totalDifferent/(float)totalWords*numberOfPairs);
  }
}


void main(int argc,char **argv)
{
  char *listFile;
  int numberOfFonts;
  Dictionary fonts[MAX_FONTS];
  char *fontNames[MAX_FONTS];
  char *wordNames[MAX_WORDS];
  int numberOfWords;
  float centerWeight;
  int normalBandWidth;
  BOOLEAN lengthNormalize,useL2,slopeConstrain,warp,topToBottomOption,hillToValleyOption;
  float topToBottom,hillToValleyLocal;
  FILE *listfp;
  int i,x,y;
  DiffDescriptorBody dd;

  centerWeight = 1.0;
  normalBandWidth = 20;
  topToBottom = 1.0;
  hillToValleyLocal = 1.0;
  DefArg("%s","listFile",&listFile);
  DefOption("-L2","-L2",&useL2);
  DefOption("-slopeConstrain %f","-slopeConstrain <center weight>",
            &slopeConstrain,&centerWeight);
  DefOption("-warp %f %d","-warp <center weight> <band width>",
            &warp,&centerWeight,&normalBandWidth);
  DefOption("-normalize","-normalize",&lengthNormalize);
  DefOption("-topToBottom %f","-topToBottom <ratio>",&topToBottomOption,&topToBottom);
  DefOption("-hillToValley %f","-hillToValley <ratio>",&hillToValleyOption,&hillToValleyLocal);
  ScanArgs(argc,argv);

  if ((listfp = fopen(listFile,"r")) == NULL)
    DoError("Error opening file %s.\n",listFile);

  /* Read in the number of words in each dictionary */
  numberOfWords = ReadInt(listfp);
  if (numberOfWords > MAX_WORDS)
    DoError("%s: too many words.\n",argv[0]);

  /* Read in the words */
  for (i=0;i<numberOfWords;++i) {
    wordNames[i] = ReadString(listfp);
  }

  /* Read in the number of dictionaries */
  numberOfFonts = ReadInt(listfp);
  if (numberOfFonts > MAX_FONTS)
    DoError("%s: too many dictionaries.\n",argv[0]);

  /* Read in the dictionaries and their names */
  for (i=0;i<numberOfFonts;++i) {
    fontNames[i] = ReadString(listfp);
    fonts[i] = ReadDictionary(fontNames[i]);
  }

  /* Check to see that all dictionaries have the same number of shapes as the
   * specified number of words. */
  for (i=1;i<numberOfFonts;++i)
    if (fonts[i]->numberOfEntries < numberOfWords)
      DoError("Dictionary %s has too few entries.\n",fontNames[i]);

  if (useL2) {
    printf("Using L2 on length normalized shapes.\n");
    dd.diffType = L2;
  }
  else if (slopeConstrain) {
    printf("Using dynamic time warping with slope constrained to [0.5,2].\n");
    dd.diffType = CONSTRAINED;
  }
  else {
    printf("Using dynamic time warping with bandwidth %d.\n",normalBandWidth);
    dd.diffType = WARP;
    dd.bandWidth = normalBandWidth;
  }
  if (!useL2) {
    printf("Center weight = %f.\n",centerWeight);
    dd.centerWeight = centerWeight;
    if (lengthNormalize) {
      dd.lengthNormalize = TRUE;
      printf("Scores normalized by signal length.\n");
    }
    else
      dd.lengthNormalize = FALSE;
  }
  dd.hillToValley = hillToValleyLocal;
  dd.topToBottom = topToBottom;
  dd.pathFP = NULL;

  printf("Fonts:\n");
  for (i=0;i<numberOfFonts;++i)
    printf("%d: %s\n",i,fontNames[i]); /* pass i to match the %d conversion */
  printf("\n");

  PrintEquivalenceClasses(numberOfWords,wordNames,numberOfFonts,fonts,fontNames,&dd);
}

Jul 3 14:31 1991 extract.c

#include <stdio.h>
#include <math.h>
#include <values.h>
#include "boolean.h"
#include "types.h"
#include "dict.h"

#define MAX_STRING_LEN 100
int ReadInt(FILE *fp)
{
  char s[MAX_STRING_LEN];
  int x;

  if (fgets(s,MAX_STRING_LEN,fp) == NULL)
    DoError("ReadInt: unexpected end of file.\n",NULL);
  while (sscanf(s,"%d",&x) != 1) {
    fprintf(stderr,"ReadInt: integer expected - reenter.\n");
    /* Read the next line; without this the loop would never terminate. */
    if (fgets(s,MAX_STRING_LEN,fp) == NULL)
      DoError("ReadInt: unexpected end of file.\n",NULL);
  }
  return x;
}

void main(int argc,char **argv)
{
  char *infile,*listFile,*outfile;
  Dictionary dict1,dict2;
  int i;
  int numberOfEntries;
  FILE *fp;

  if (argc != 4) {
    printf("Usage:\n");
    printf(" %s infile listfile outfile\n",argv[0]);
    exit(-1);
  }

  infile = argv[1];
  listFile = argv[2];
  outfile = argv[3];

  dict1 = ReadDictionary(infile);

  if ((fp = fopen(listFile,"r")) == NULL)
    DoError("%s: error reading list file.\n",argv[0]);

  numberOfEntries = ReadInt(fp);
  if (numberOfEntries < 0)
    DoError("%s: list file must have a positive number of elements.\n",argv[0]);
  printf("Copying %d shapes.\n",numberOfEntries);

  dict2 = NewDict(numberOfEntries);

  dict2->infoString = dict1->infoString;
  for (i=0;i<numberOfEntries;++i) {
    int shape;
    shape = ReadInt(fp);
    if ((shape < 0)||(shape >= dict1->numberOfEntries))
      DoError("%s: bad shape index.\n",argv[0]);
    *(dict2->outlines+i) = *(dict1->outlines+shape);
    *(dict2->rawOutlines+i) = *(dict1->rawOutlines+shape);
  }
  fclose(fp);
  WriteDictionary(dict2,outfile);

}
Jun 14 16:12 1991 l2Norm.c

#include <stdio.h>
#include <values.h>
#include <string.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "dict.h"

float L2Norm(OutlinePair signal, int startOffset,
             OutlinePair model)
{
  float *top1,*top2,*bottom1,*bottom2;
  int i,overlap;
  float sum;
  float temp;

  if ((startOffset < 0) ||
      (startOffset+model->numberOfLegs > signal->numberOfLegs))
    DoError("L2Norm: the model must overlap the signal.\n",NULL);

  top1 = signal->top+startOffset;
  top2 = model->top;
  bottom1 = signal->bottom+startOffset;
  bottom2 = model->bottom;

  overlap = signal->numberOfLegs - startOffset;
  if (overlap > model->numberOfLegs)
    overlap = model->numberOfLegs;

  for (i=0,sum=0;i<overlap;++i) {
    temp = *top1++ - *top2++;
    sum += temp * temp;
    temp = *bottom1++ - *bottom2++;
    sum += temp * temp;
  }

  return sum;
}
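
L2Norm sums the squared differences between the model's top and bottom contours and the equally long stretch of the signal that begins at startOffset. The following self-contained sketch shows the same computation on plain arrays. The Outline struct and the name l2_window are hypothetical stand-ins for the listing's OutlinePair and L2Norm, and returning -1 instead of calling DoError is likewise an assumption made to keep the example standalone.

```c
/* Hypothetical stand-in for OutlinePair: top and bottom contours. */
typedef struct {
  const float *top;
  const float *bottom;
  int numberOfLegs;
} Outline;

/* Squared L2 distance between the model and the window of the signal
 * that starts at startOffset.  Returns -1 when the model does not fit
 * inside the signal at that offset. */
static float l2_window(const Outline *signal, int startOffset,
                       const Outline *model)
{
  float sum = 0.0f;
  int i;
  if (startOffset < 0 ||
      startOffset + model->numberOfLegs > signal->numberOfLegs)
    return -1.0f;
  for (i = 0; i < model->numberOfLegs; ++i) {
    float dt = signal->top[startOffset + i] - model->top[i];
    float db = signal->bottom[startOffset + i] - model->bottom[i];
    sum += dt * dt + db * db;
  }
  return sum;
}
```

Because the result is a sum of squares rather than its square root, costs from adjacent windows can be added directly, which is what the string matcher relies on.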
39 

40 OutlinePair LookupShape(char c, Dictionary models) 

42 /* dictionary file has the following order; 

43 ABCDEFGHIJKLMNOPQRSTUVWXYZ 

44 abcdefghijktmnopqrstuvwxyz 

45 0123456789 
46 

47 */ 

48 intshapelndex; 

49 if((c>='a'&&c< = V)) 

50 shapelndex = c-'a'; 

51 elseif(c = =V) 

52 shapelndex = 26; 
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53 elseif(c = =7) 

54 shapelndex = 27; 

55 else 

56 DoError( M LookupShape: have no shape one of the characters.\n",NULL); 

57 return *{models->out!ines+shapelndex); 

58 } 
,59 

60 

#define MAX_STRING_LENGTH 30
#define MAX_SHIFT 10
#define MAX_OVERLAP 5
float L2CompareWithString(OutlinePair signal, char *string,
                          Dictionary models)
{
  float *costMatrix;
  int *pathMatrix;
  int numberOfChars;
  int letterIndex, startOffset;
  float *cursor;
  int left;
  int right;
  OutlinePair modelShapes[MAX_STRING_LENGTH];
  char *charCursor;
  float minValue;
  float temp;
  int i,oldLeft,oldRight,minIndex;

  /* Make sure input string is not too long. */
  numberOfChars = strlen(string);
  if (numberOfChars >= MAX_STRING_LENGTH)
    DoError("L2CompareWithString: string is too long.\n",NULL);

  /* Allocate space for dynamic programming array. */
  /* For now, be a space hog. */
  costMatrix = (float *)calloc(signal->numberOfLegs*numberOfChars,
                               sizeof(float));
  pathMatrix = (int *)calloc(signal->numberOfLegs*numberOfChars,
                             sizeof(int));
  if ((costMatrix == NULL)||(pathMatrix == NULL))
    DoError("L2CompareWithString: cannot allocate space.\n",NULL);

  /* Lookup the shapes corresponding to the characters in the string. */
  charCursor = string;
  for (i=0;i<numberOfChars;++i)
    modelShapes[i] = LookupShape(*charCursor++,models);

  /* Since the cost matrix is larger than the region containing valid
   * alignments, first fill in the array with large costs. Later, some
   * of these will be overwritten. */
  cursor = costMatrix;
  for (i=0;i<signal->numberOfLegs*numberOfChars;++i)
    *cursor++ = MAXFLOAT;

  /* Fill in leftmost column */
  left = 0;
  right = MAX_SHIFT;
  for (startOffset=left;startOffset<right;++startOffset)
    if (startOffset+modelShapes[0]->numberOfLegs <=
        signal->numberOfLegs)
      *(costMatrix+startOffset*numberOfChars) =
        L2Norm(signal,startOffset,modelShapes[0]);

  /* Now do the rest of the columns */
  for (letterIndex=1;letterIndex<numberOfChars;++letterIndex) {
    oldLeft = left;
    oldRight = right;
    left += modelShapes[letterIndex-1]->numberOfLegs;
    right += modelShapes[letterIndex-1]->numberOfLegs+MAX_SHIFT;
    for (startOffset=left;startOffset<right;++startOffset) {
      if (startOffset+modelShapes[letterIndex]->numberOfLegs <=
          signal->numberOfLegs) {
        temp = L2Norm(signal,startOffset,modelShapes[letterIndex]);

        /* This could be made quite a bit faster since for each start offset,
         * we just add an element to the set we are minimizing over. */
        minValue = MAXFLOAT;
        /* *(costMatrix+oldLeft*numberOfChars+letterIndex-1); */
        minIndex = oldLeft;
        for (i=oldLeft;(i<oldRight)&&(i<startOffset);++i) {
          float temp2;
          temp2 = *(costMatrix+i*numberOfChars+letterIndex-1);
          if (temp2 < minValue) {
            minIndex = i;
            minValue = temp2;
          }
        }
        *(costMatrix+startOffset*numberOfChars+letterIndex) =
          minValue+temp;
        *(pathMatrix+startOffset*numberOfChars+letterIndex) =
          minIndex;
      } /* End of if */

    } /* for startOffset */
  } /* for letterIndex */

  /* Now that all the costs have been filled in, find the cheapest */
  if (right-1+modelShapes[numberOfChars-1]->numberOfLegs+MAX_SHIFT <
      signal->numberOfLegs)
    /* In this case, the chain of letter shapes does not span the signal. */
    minValue = MAXFLOAT;
  else {
    minValue = MAXFLOAT;
    minIndex = left;
    for (i=left;(i<right)&&(i<signal->numberOfLegs);++i) {
      float temp2;
      temp2 = *(costMatrix+i*numberOfChars+numberOfChars-1);
      if (temp2 < minValue) {
        minIndex = i;
        minValue = temp2;
      }
    }
  }

  free(costMatrix);
  free(pathMatrix);
  return minValue;
}
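
L2CompareWithString is a dynamic program: for each letter and each start offset it stores the cheapest total cost of placing that letter there, given where the previous letter could have started. The sketch below illustrates the idea in a simplified form and is not the listing's exact window logic. It assumes the per-letter placement costs are already tabulated, requires each letter to start at or after the end of the previous one with a gap smaller than maxShift, and uses the hypothetical names cheapest_chain and N_OFFSETS. It also guards against adding to FLT_MAX, so that an unreachable cell stays unreachable.

```c
#include <float.h>

#define N_OFFSETS 16   /* illustrative signal length, an assumption */

/* cost[k*N_OFFSETS + o] is the cost of placing letter k at offset o,
 * or FLT_MAX where the placement is invalid; width[k] is the width of
 * letter k.  The first letter may start at offsets 0..maxShift-1.
 * Returns the cheapest total cost, or FLT_MAX if no chain exists. */
static float cheapest_chain(const float *cost, const int *width,
                            int nLetters, int maxShift)
{
  float best[N_OFFSETS], next[N_OFFSETS];
  float m;
  int k, o, p;

  for (o = 0; o < N_OFFSETS; ++o)
    best[o] = (o < maxShift) ? cost[o] : FLT_MAX;

  for (k = 1; k < nLetters; ++k) {
    for (o = 0; o < N_OFFSETS; ++o) {
      float c = cost[k * N_OFFSETS + o];
      m = FLT_MAX;
      /* Minimize over predecessor starts p that end within maxShift of o. */
      for (p = 0; p < N_OFFSETS; ++p) {
        int gap = o - (p + width[k - 1]);
        if (gap >= 0 && gap < maxShift && best[p] < m)
          m = best[p];
      }
      next[o] = (m < FLT_MAX && c < FLT_MAX) ? m + c : FLT_MAX;
    }
    for (o = 0; o < N_OFFSETS; ++o)
      best[o] = next[o];
  }

  m = FLT_MAX;
  for (o = 0; o < N_OFFSETS; ++o)
    if (best[o] < m)
      m = best[o];
  return m;
}
```

Keeping only the previous column (best) instead of the full cost matrix is enough when only the final score is needed; the listing keeps the whole matrix so that a path could be traced back through pathMatrix.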
170 

void PrintPath(int *pm, int width, int height, int index)
{
  int i;
  for (i=width-1;i>=0;--i) {
    printf("%d ",index);
    index = *(pm+index*width+i);
  }
  printf("\n");
}

void PrintMatrix(float *m,int width, int height)
{
  int i;
  int j;
  for (i=0;i<height;++i) {
    printf("%d: ",i);
    for (j=0;j<width;++j)
      printf("%6.2e ",*m++);
    printf("\n");
  }
}

typedef struct CTuple {
  int index;
  float value;
} CompareTuple;

int TupleLessThan(CompareTuple *t1,CompareTuple *t2)
{
  return t1->value > t2->value;
}
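
The standard qsort expects its comparison function to return a negative, zero, or positive value. A function that returns only the 0 or 1 result of a > test, as TupleLessThan does, never reports "less than", so the resulting order is not guaranteed by the C standard. The following is a hedged sketch of a three-way comparator that orders tuples by increasing value; the names Tuple, tuple_compare, and sort_tuples are illustrative and not part of the listing.

```c
#include <stdlib.h>

typedef struct {
  int index;
  float value;
} Tuple;

/* Three-way comparison on value: -1, 0, or 1. */
static int tuple_compare(const void *a, const void *b)
{
  const Tuple *t1 = (const Tuple *)a;
  const Tuple *t2 = (const Tuple *)b;
  return (t1->value > t2->value) - (t1->value < t2->value);
}

/* Sort n tuples so that the smallest value (best match) comes first. */
static void sort_tuples(Tuple *t, size_t n)
{
  qsort(t, n, sizeof(Tuple), tuple_compare);
}
```

After sorting, element 0 holds the index of the lowest-cost shape, which is the entry the batch mode of the listing prints.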

void L2CompareDictToString(Dictionary unknownDict,
                           char *string,
                           Dictionary modelDict,
                           BOOLEAN isBatch)
{
  CompareTuple *results;
  int i;

  if ((results = (CompareTuple *)calloc(unknownDict->numberOfEntries,
                                        sizeof(CompareTuple))) ==
      NULL)
    DoError("L2CompareDictToString: cannot allocate space.\n",NULL);

  for (i=0;i<unknownDict->numberOfEntries;++i) {
    (*(results+i)).index = i;
    (*(results+i)).value = L2CompareWithString(*(unknownDict->outlines+i),
                                               string,
                                               modelDict);
  }

  qsort(results,
        unknownDict->numberOfEntries,
        sizeof(CompareTuple),
        TupleLessThan);

  if (isBatch) {
    printf("%d(%f)\n",(*results).index,(*results).value);
  }
  else {
    printf("\n");
    for (i=0;(i<5)&&(i<unknownDict->numberOfEntries);++i)
      printf("%d:%f\n",(*(results+i)).index,(*(results+i)).value);
    printf("\n");
  }

  free(results);
}

void main(int argc,char **argv)
{
  char *unknowns,*models;
  char s[MAX_STRING_LENGTH+1];
  Dictionary unknownDict, modelDict;
  int selection;
  char *crPointer;
  BOOLEAN done = FALSE;
  BOOLEAN batch;
  char *words;

  if (argc != 3 && argc != 4) {
    printf("Usage:\n");
    printf(" %s <unknowns> <alphabet> [<batch wordlist>]\n",argv[0]);
    printf(" If the batch file is not specified, the program runs\n");
    printf(" in interactive mode.\n");
    exit(-1);
  }

  unknowns = argv[1];
  models = argv[2];
  if (argc == 4) {
    batch = TRUE;
    words = argv[3];
  } else
    batch = FALSE;

  unknownDict = ReadDictionary(unknowns);
  modelDict = ReadDictionary(models);

  if (batch) {
    FILE *fp;
    if ((fp = fopen(words,"r")) == NULL)
      DoError("l2Norm: can't open input file %s.\n",words);
    while (!done) {
      fgets(s,MAX_STRING_LENGTH,fp);
      if ((s[0]=='\0')||(s[0]=='\n'))
        done = TRUE;
      else {
        crPointer = strchr(s,'\n');
        if (crPointer != NULL)
          *crPointer = '\0';
        printf("%s: ",s);
        L2CompareDictToString(unknownDict,s,modelDict,TRUE);
      }
    }
  }
  else {
    while (!done) {
      printf("Enter a word to search for: ");
      fgets(s,MAX_STRING_LENGTH,stdin);
      if ((s[0]=='\0')||(s[0]=='\n'))
        done = TRUE;
      else {
        crPointer = strchr(s,'\n');
        if (crPointer != NULL)
          *crPointer = '\0';
        printf("Comparing shape %s to the image\n",s);
        L2CompareDictToString(unknownDict,s,modelDict,FALSE);
      }
    }
  }
}
#include <stdio.h>
#include <values.h>
#include <string.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "dict.h"

#define MAX_STRING_LENGTH 30
#define MAX_SIGNAL_LENGTH 300
#define MAX_SHIFT 10

#define MIN(a,b) ((a)<(b)?(a):(b))
#define MAX(a,b) ((a)>(b)?(a):(b))

typedef struct {
  int numberOfSymbols;
  int signalLength;
  float *costs;
} *CorrelationSet,CorrelationSetBody;

CorrelationSet NewCorrelationSet(int numberOfSymbols,int signalLength)
{
  CorrelationSet temp;
  if ((temp = (CorrelationSet)calloc(1,sizeof(CorrelationSetBody))) == NULL)
    DoError("NewCorrelationSet: cannot allocate space.\n",NULL);
  temp->numberOfSymbols = numberOfSymbols;
  temp->signalLength = signalLength;
  if ((temp->costs = (float *)calloc(numberOfSymbols*signalLength,sizeof(float))) == NULL)
    DoError("NewCorrelationSet: cannot allocate space.\n",NULL);
  return temp;
}

#ifdef foo
float CorrelationValue(CorrelationSet c,int symbol,int offset)
{
  return *(c->costs+symbol*c->signalLength+offset);
}

void SetCorrelationValue(CorrelationSet c,int symbol,int offset,float value)
{
  *(c->costs+symbol*c->signalLength+offset) = value;
}
#endif
#define CorrelationValue(c,s,o) (*((c)->costs+(s)*(c)->signalLength+(o)))
#define SetCorrelationValue(c,s,o,v) (*((c)->costs+(s)*(c)->signalLength+(o)) = (v))

int CorrelationSetSize(CorrelationSet c)
{
  return c->numberOfSymbols;
}

int CorrelationSetWidth(CorrelationSet c)
{
  return c->signalLength;
}

void PrintCorrelation(CorrelationSet c,int character)
{
  int i;

  for (i=0;i<c->signalLength;++i) {
    printf("%d:%6.2f\n",i,*(c->costs+character*c->signalLength+i));
  }
  printf("\n");
}

float L2Norm(OutlinePair signal, int startOffset,
             OutlinePair model)
{
  float *top1,*top2,*bottom1,*bottom2;
  int i,overlap;
  float sum;
  float temp;

  if ((startOffset < 0) ||
      (startOffset+model->numberOfLegs > signal->numberOfLegs))
    DoError("L2Norm: the model must overlap the signal.\n",NULL);

  top1 = signal->top+startOffset;
  top2 = model->top;
  bottom1 = signal->bottom+startOffset;
  bottom2 = model->bottom;

  overlap = signal->numberOfLegs - startOffset;
  if (overlap > model->numberOfLegs)
    overlap = model->numberOfLegs;

  for (i=0,sum=0;i<overlap;++i) {
    temp = *top1++ - *top2++;
    sum += temp * temp;
    temp = *bottom1++ - *bottom2++;
    sum += temp * temp;
  }

  return sum;
}

CorrelationSet PreProcessSignalWithChars(OutlinePair signal,Dictionary charDict)
{
  CorrelationSet cSet;
  int thisChar,offset;
  OutlinePair charSignal;

  cSet = NewCorrelationSet(charDict->numberOfEntries,signal->numberOfLegs);

  for (thisChar=0;thisChar<charDict->numberOfEntries;++thisChar) {
    charSignal = *(charDict->outlines+thisChar);
    for (offset=0;offset<signal->numberOfLegs-charSignal->numberOfLegs+1;
         ++offset)
      SetCorrelationValue(cSet,thisChar,offset,L2Norm(signal,offset,charSignal));
  }
  return cSet;
}

CorrelationSet *PreProcessDictionaryWithChars(Dictionary signalDict,Dictionary charDict)
{
  CorrelationSet *correlationSets;
  int thisWord;

  correlationSets = (CorrelationSet
                     *)calloc(signalDict->numberOfEntries,sizeof(CorrelationSet));
  if (correlationSets == NULL)
    DoError("PreProcessDictionary: cannot allocate space.\n",NULL);
  for (thisWord=0;thisWord<signalDict->numberOfEntries;++thisWord) {
    *(correlationSets+thisWord) =
      PreProcessSignalWithChars(*(signalDict->outlines+thisWord),charDict);
    printf("%d ",thisWord);
  }
  return correlationSets;
}

CorrelationSet PreProcessSignalWithBlanks(OutlinePair signal)
{
  CorrelationSet cSet;
  int blankWidth,offset;
  int numberOfLegs = signal->numberOfLegs;

  cSet = NewCorrelationSet(MAX_SHIFT,numberOfLegs);

  for (offset=0;offset<numberOfLegs;++offset) {
    SetCorrelationValue(cSet,0,offset,0);
  }
  for (offset=0;offset<numberOfLegs;++offset) {
    float top,bottom;
    top = *(signal->top+offset);
    bottom = *(signal->bottom+offset);
    SetCorrelationValue(cSet,1,offset,top*top+bottom*bottom);
  }
  for (blankWidth=2;blankWidth<MAX_SHIFT;++blankWidth) {
    for (offset=0;offset<numberOfLegs-blankWidth+1;++offset) {
      float top,bottom,temp;
      top = *(signal->top+offset+blankWidth-1);
      bottom = *(signal->bottom+offset+blankWidth-1);
      temp = top*top+bottom*bottom+CorrelationValue(cSet,blankWidth-1,offset);
      SetCorrelationValue(cSet,blankWidth,offset,temp);
    }
  }
  return cSet;
}
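
PreProcessSignalWithBlanks tabulates the cost of treating a stretch of the signal as blank: the cost for width w at a given offset is the cost for width w-1 at the same offset plus the squared top and bottom values of the one additional leg. The sketch below shows that recurrence on fixed-size arrays; the name blank_costs and the sizes MAX_W and N_LEGS are illustrative assumptions.

```c
#define MAX_W 4    /* illustrative exclusive bound on blank width */
#define N_LEGS 5   /* illustrative signal length */

/* cost[w][o] = sum of top^2 + bottom^2 over legs o .. o+w-1.
 * Entries whose stretch would run past the signal are left at 0,
 * as calloc does in the listing. */
static void blank_costs(const float *top, const float *bottom,
                        float cost[MAX_W][N_LEGS])
{
  int w, o;
  for (w = 0; w < MAX_W; ++w)
    for (o = 0; o < N_LEGS; ++o)
      cost[w][o] = 0.0f;
  for (w = 1; w < MAX_W; ++w)
    for (o = 0; o + w <= N_LEGS; ++o) {
      float t = top[o + w - 1];      /* the leg added by widening to w */
      float b = bottom[o + w - 1];
      cost[w][o] = cost[w - 1][o] + t * t + b * b;
    }
}
```

Building each row from the previous one costs one addition per entry, instead of re-summing w legs for every offset.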

CorrelationSet *PreProcessDictionaryWithBlanks(Dictionary signalDict)
{
  CorrelationSet *correlations;
  int thisWord;

  correlations = (CorrelationSet
                  *)calloc(signalDict->numberOfEntries,sizeof(CorrelationSet));
  for (thisWord=0;thisWord<signalDict->numberOfEntries;++thisWord) {
    *(correlations+thisWord) =
      PreProcessSignalWithBlanks(*(signalDict->outlines+thisWord));
    printf("%d ",thisWord);
  }
  return correlations;
}

173 int LookupShapelndex(char c, Dictionary models) 

174 { 

175 /* dictionary file has the following order: 

176 ABCDEFGHIJKLMNOPQRSTUVWXYZ 

177 abcdefghijklmnopqrstuwvxyz 
178. 0123456789 

179 

180 */ 

181 int shapelndex; 

182 if <(c>='a'&&c< = 'z')) 

183 shapelndex = c-'a'; 

184 else if (c == 7) 

185 shapelndex = 26; 

186 elseif(c = =7) 

187 shapelndex = 27; 

188 else 

189 DoErrorfLookupShape: have no shape one of the characters.\n N ,NULL); 

1 90 return shapelndex; 

191 } 
192 

193 

194 float L2CompareWithString(int signallndex, 

135 char*string, 

1 96 CorrelationSet charCorrelations, 

1 97 CorrelationSet blankCorrelations, 

198 Dictionary signalDict, 
1" Dictionary models) 

200 { 

201 /* Allocate space for dynamic programming array. */ 

202 /* For now, be a space hog, */ 

203 float costMatrix[MAX_SIGNALJ.ENGTH][MAX STRING LENGTH]; 

204 int pathMatrix[MAX.SIGNAL„LENGTH][MAX_STRING LENGTH]; 

205 char *charCursor; 

206 OutlinePairmodelShapesIMAXJTRING LENGTH); 

207 int modellndices[MAX_STRING J.ENGTH]; 

208 int numberOf Chars; 
209 

210 int letterlndex, startOffset; 

211 int left,right; 

212 intsearchLeft,searchRight,rightEdge; 
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213 float minValue; 

214 float temp; 

215 inti,oldLeft,oldRight,minlndex; 

216 int signal Length; 
217 

218 signalLength = (M$ignalDict->outlines+signallndex))->numberOfLegs; 
219 

  /* Make sure input string is not too long. */
  numberOfChars = strlen(string);
  if (numberOfChars >= MAX_STRING_LENGTH)
    DoError("L2CompareWithString: string is too long.\n",NULL);

  /* Make sure signal is not too long. */
  if (signalLength >= MAX_SIGNAL_LENGTH)
    DoError("L2CompareWithString: signal is too long.\n",NULL);

  /* Lookup the indices of the signals corresponding to the characters in the string. */
  charCursor = string;
  for (i=0;i<numberOfChars;++i) {
    modelIndices[i] = LookupShapeIndex(*charCursor++,models);
    modelShapes[i] = *(models->outlines+modelIndices[i]);
  }

  /* Since the cost matrix is larger than the region containing valid
   * alignments, first fill in the array with large costs. Later, some
   * of these will be overwritten. */
  /* WARNING: does MAXFLOAT + smallFloat == MAXFLOAT or does it roll? */
  {
    float *cursor;
    cursor = &(costMatrix[0][0]);
    for (i=0;i<MAX_SIGNAL_LENGTH*MAX_STRING_LENGTH;++i)
      *cursor++ = MAXFLOAT;
  }

  /* Fill in leftmost column */
  left = 0;
  right = MIN(MAX_SHIFT,signalLength-modelShapes[0]->numberOfLegs);
  for (startOffset=left;startOffset<right;++startOffset)
    costMatrix[startOffset][0] = CorrelationValue(blankCorrelations,startOffset,startOffset) +
      CorrelationValue(charCorrelations,modelIndices[0],startOffset);


  /* Now do the rest of the columns */
  for (letterIndex=1;letterIndex<numberOfChars;++letterIndex) {
    oldLeft = left;
    oldRight = right;
    left += modelShapes[letterIndex-1]->numberOfLegs;
    /* If string of characters is too long for this signal, abort by returning a large cost. */
    if (left >= signalLength)
      return MAXFLOAT;
    right += modelShapes[letterIndex-1]->numberOfLegs+MAX_SHIFT;
    right = MIN(right,signalLength-modelShapes[letterIndex]->numberOfLegs+1);

    for (startOffset=left;startOffset<right;++startOffset) {
      temp = CorrelationValue(charCorrelations,modelIndices[letterIndex],startOffset);

      /* This could be made quite a bit faster since for each start offset,
       * we just add an element to the set we are minimizing over. */

      searchLeft = startOffset-modelShapes[letterIndex-1]->numberOfLegs-MAX_SHIFT;
      searchLeft = MAX(searchLeft,oldLeft);
      rightEdge = searchLeft+modelShapes[letterIndex-1]->numberOfLegs;
      searchRight = startOffset-modelShapes[letterIndex-1]->numberOfLegs;
      searchRight = MIN(searchRight,oldRight);

      minIndex = searchLeft;
      minValue = costMatrix[searchLeft][letterIndex-1] +
        CorrelationValue(blankCorrelations,startOffset-rightEdge,rightEdge);
      for (i=searchLeft;i<searchRight;++i,++rightEdge) {
        float temp;
        temp = costMatrix[i][letterIndex-1] +
          CorrelationValue(blankCorrelations,startOffset-rightEdge,rightEdge);
        if (temp < minValue) {
          minIndex = i;
          minValue = temp;
        }
      }

      costMatrix[startOffset][letterIndex] = minValue + temp;
      pathMatrix[startOffset][letterIndex] = minIndex;
    } /* for startOffset */
  } /* for letterIndex */


  /* fill in the costs for blanks at the end of the word */
  rightEdge = left + modelShapes[letterIndex-1]->numberOfLegs;
  for (startOffset=left;startOffset<right;++startOffset,++rightEdge) {
    if (rightEdge+MAX_SHIFT >= signalLength) {
      costMatrix[startOffset][letterIndex-1] +=
        CorrelationValue(blankCorrelations,signalLength-1-rightEdge,rightEdge);
    }
    else {
      /* this chain of letters does not span the word */
      costMatrix[startOffset][letterIndex-1] = MAXFLOAT;
    }
  }

  /* keep minIndex for debugging purposes */
  minIndex = left;
  minValue = costMatrix[left][letterIndex-1];
  for (i=left;i<right;++i) {
    float temp;
    temp = costMatrix[i][letterIndex-1];
    if (temp < minValue) {
      minIndex = i;
      minValue = temp;
    }
  }

  return minValue;
}

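The WARNING comment in L2CompareWithString asks whether MAXFLOAT plus a small float stays at MAXFLOAT. The sketch below checks that behavior, assuming IEEE 754 single precision with the default rounding mode and using FLT_MAX from <float.h> as a stand-in for MAXFLOAT. The helper names are hypothetical and are not part of the listing.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns 1 if adding `small` to FLT_MAX leaves it unchanged. */
int max_plus_small_is_max(float small)
{
    volatile float sum = FLT_MAX;   /* volatile forces the result to float precision */
    sum = sum + small;
    return sum == FLT_MAX;
}

/* Hypothetical helper: returns 1 if FLT_MAX + FLT_MAX overflows to infinity. */
int max_plus_max_is_inf(void)
{
    volatile float sum = FLT_MAX;
    sum = sum + FLT_MAX;
    return isinf(sum) != 0;
}
```

Under these assumptions a small correlation cost added to a MAXFLOAT cell is absorbed, while the sum of two MAXFLOAT cells becomes infinity rather than wrapping around, so such cells still compare as larger than any valid cost.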

void PrintPath(int *pm,int width,int height,int index)
{
  int i;
  for (i=width-1;i>=0;--i) {
    printf("%d ",index);
    index = *(pm+index*width+i);
  }
  printf("\n");
}

void PrintMatrix(float *m,int width,int height)
{
  int i;
  int j;
  for (i=0;i<height;++i) {
    printf("%d: ",i);
    for (j=0;j<width;++j)
      printf("%6.2e ",*m++);
    printf("\n");
  }
}


typedef struct CTuple {
  int index;
  float value;
} CompareTuple;

int TupleLessThan(CompareTuple *t1,CompareTuple *t2)
{
  return t1->value > t2->value;
}

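TupleLessThan is handed to qsort, which expects a comparison function that takes two const void pointers and returns a negative, zero, or positive value. A minimal sketch of such a three-way comparator follows. The Tuple type mirrors CompareTuple, and the names tuple_compare and sort_tuples are hypothetical.

```c
#include <stdlib.h>

/* Mirrors the CompareTuple layout from the listing. */
typedef struct {
    int index;
    float value;
} Tuple;

/* Three-way comparison on value: negative, zero, or positive, as qsort expects. */
static int tuple_compare(const void *p1, const void *p2)
{
    const Tuple *t1 = (const Tuple *)p1;
    const Tuple *t2 = (const Tuple *)p2;
    return (t1->value > t2->value) - (t1->value < t2->value);
}

/* Sorts the tuples in place so that the lowest cost comes first. */
void sort_tuples(Tuple *tuples, size_t count)
{
    qsort(tuples, count, sizeof(Tuple), tuple_compare);
}
```

After sorting, element 0 holds the best (lowest cost) match, which is the entry the batch mode of L2CompareDictToString prints.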

void L2CompareDictToString(char *string,
                           CorrelationSet *charCorrelations,
                           CorrelationSet *blankCorrelations,
                           Dictionary signalDict,
                           Dictionary modelDict,
                           BOOLEAN isBatch)
{
  CompareTuple *results;
  int i;

  if ((results = (CompareTuple *)calloc(signalDict->numberOfEntries,
                                        sizeof(CompareTuple))) ==
      NULL)
    DoError("L2CompareDictToString: cannot allocate space.\n",NULL);

  for (i=0;i<signalDict->numberOfEntries;++i) {
    (*(results+i)).index = i;
    (*(results+i)).value = L2CompareWithString(i,string,
                                               *(charCorrelations+i),
                                               *(blankCorrelations+i),
                                               signalDict,
                                               modelDict);
  }

  qsort(results,
        signalDict->numberOfEntries,
        sizeof(CompareTuple),
        TupleLessThan);

  if (isBatch) {
    printf("%d(%f)\n",(*results).index,(*results).value);
  }
  else {
    printf("\n");
    for (i=0;(i<5)&&(i<signalDict->numberOfEntries);++i)
      printf("%d: %f\n",(*(results+i)).index,(*(results+i)).value);
    printf("\n");
  }

  free(results);
}


void PrintDictStats(Dictionary dict)
{
  int i,sum = 0;
  printf("Dictionary has %d entries.\n",dict->numberOfEntries);
  for (i=0;i<dict->numberOfEntries;++i)
    sum += (*(dict->outlines+i))->numberOfLegs;
  printf("The total length of the shape contours is %d pixels.\n",sum);
}


void main(int argc,char **argv)
{
  char *unknowns,*models;
  char s[MAX_STRING_LENGTH+1];
  Dictionary unknownDict,modelDict;
  int selection;
  char *crPointer;
  BOOLEAN done = FALSE;
  BOOLEAN batch;
  char *words;
  CorrelationSet *charCorrelations;
  CorrelationSet *blankCorrelations;

  if (argc != 3 && argc != 4) {
    printf("Usage:\n");
    printf("  %s <unknowns> <alphabet> [<batch wordlist>]\n",argv[0]);
    printf("  If the batch file is not specified, the program runs\n");
    printf("  in interactive mode.\n");
    exit(-1);
  }

  unknowns = argv[1];
  models = argv[2];
  if (argc == 4) {
    batch = TRUE;
    words = argv[3];
  } else
    batch = FALSE;

  unknownDict = ReadDictionary(unknowns);
  modelDict = ReadDictionary(models);

  PrintDictStats(unknownDict);
  printf("Preprocessing...\n");
  charCorrelations = PreProcessDictionaryWithChars(unknownDict,modelDict);
  blankCorrelations = PreProcessDictionaryWithBlanks(unknownDict);
  printf("done.\n");

  if (batch) {
    FILE *fp;
    if ((fp = fopen(words,"r")) == NULL)
      DoError("l2Norm: can't open input file %s.\n",words);
    while (!done) {
      fgets(s,MAX_STRING_LENGTH,fp);
      if ((s[0] == '\0') || (s[0] == '\n'))
        done = TRUE;
      else {
        crPointer = strchr(s,'\n');
        if (crPointer != NULL)
          *crPointer = '\0';
        printf("%s: ",s);
        L2CompareDictToString(s,charCorrelations,blankCorrelations,unknownDict,modelDict,TRUE);
      }
    }
  }
  else {
    while (!done) {
      printf("Enter a word to search for: ");
      fgets(s,MAX_STRING_LENGTH,stdin);
      if ((s[0] == '\0') || (s[0] == '\n'))
        done = TRUE;
      else {
        crPointer = strchr(s,'\n');
        if (crPointer != NULL)
          *crPointer = '\0';
        printf("Comparing shape %s to the image\n",s);
        L2CompareDictToString(s,charCorrelations,blankCorrelations,unknownDict,modelDict,FALSE);
      }
    }
  }
}
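Both input loops in main call fgets without testing its return value, so the buffer is reused unchanged once the end of the word list is reached. A minimal sketch of a line reader that also stops on end of file is given below. The helper name read_word is hypothetical and does not appear in the listing.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and strips the trailing newline.
 * Returns 1 when a non-empty word was read, and 0 on end of file, on a read
 * error, or on an empty line. */
int read_word(FILE *fp, char *buf, int size)
{
    char *newline;

    if (fgets(buf, size, fp) == NULL)
        return 0;                 /* EOF or error: buf must not be trusted */
    newline = strchr(buf, '\n');
    if (newline != NULL)
        *newline = '\0';
    return buf[0] != '\0';
}
```

With such a helper, each loop could be written as `while (read_word(fp, s, MAX_STRING_LENGTH)) { ... }`, which ends on a blank line as the listing does and also when the file runs out.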

Jan 15 21:32 1991 match.c

/*
 * match.c
 * align 2 sequences
 *
 * run as: match seq1 seq2
 *
 */

/*
 * TO DO: 1) don't compute over parts of array outside of constraints
 *        2) distance score for top and bottom paths
 */

#include <stdio.h>
#include <math.h>

#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"
#include "diff2.h"
#include "match.h"

#ifndef MAXLINE
#define MAXLINE 256
#endif

#ifndef MAXNAME
#define MAXNAME 128
#endif

#ifndef TRUE
#define TRUE 1
#endif

#ifndef FALSE
#define FALSE 0
#endif

int matchcntr = 1; /* used for writing out set number of matches */


/*
void
main(argc,argv)
int argc;
char *argv[];
{
*/
/*
 * read in multiple parameter files, write out selected fields as shorts
 */
/*
  int i,j;
  int seqlength;
  float matchvecs();

  float test[MAXSEQLENGTH];
  float ref[MAXSEQLENGTH];
*/
/*
 * read in args
 */

65 /* 

66 debug = FALSE; 
67 

68 for (;argc > 1 && (argv[1][0] = = '-'); argc--, argv+ +) 

69 { 

70 switch (argv[1 

71 { 

72 case'd': 

73 debug = TRUE; 

74 break; 

75 caseV: 

76 horweight = (float)atoi(&argv[1][2j); 

77 break; 

78 caseV: 

79 verweight = {float)atoi(&argv[1][2j); 

80 break; 

81 caseV: 

82 diagweight = (float)atoi(&argv[1][2]); 

83 break; 

84 defauft: 

85 printf ("match: unknown switch %s.\n° ( argv[1]); 

86 exit(1); 

87 } 

88 } 
89 

90 if(argc!=1) 

91 { 

92 printf ("Usage: match [-b<begsamp> -d(debug)-e<endsamp>\n"); 

93 printf ("argc: %d\n\ argc); 

94 exit(1); 

95 } 

96 */ 

97 /* debugging */ 

98 /* for(i = 0; i < 5; i + + ) 

99 test[i] = (float)i; 

100 for (i = 5; i < 10; i + 

101 test[i] = (float)(.5*(i-4) + 5); 

102 for(l = 0;i< 5; 

103 refti] = 15* I; 

1 04 matchvecs(test, 1 0, ref , 5); 
105 

106 } 

*/

/*
float DPDiffPair(OutlinePair one, OutlinePair two)
{
  if (one == two) {
    printf("matches\n");
    return(0.0);
  }
  else {
    printf("nomatch\n");
    return(1.0);
  }
}
*/

float DPDiffPair(OutlinePair one, OutlinePair two)
/*
 * question, should top and bottom distance be forced to be computed together?
 * use another distance score to check how far off the two are?
 */

{
  float topscore;
  float bottomscore;

  if (debug) printf("top: ");
  topscore = matchvecs(one->top, one->numberOfLegs,
                       two->top, two->numberOfLegs);
  if (debug) printf("bottom: ");
  bottomscore = matchvecs(one->bottom, one->numberOfLegs,
                          two->bottom, two->numberOfLegs);
  return (topscore + bottomscore);
}


float matchvecs(float *Vec1, int lenVec1, float *Vec2, int lenVec2)
/*
 * Computes the best path between one and two.
 * Allows 2/1 expansion/compression
 */
{
  float dist, mindist, hor, vert, diag;
  float bestscore;
  int i1, i2;
  int xdir, ydir;

  elt *array[MAXSEQLENGTH][MAXSEQLENGTH];
  elt *aelt;

  /* initialize array */

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      array[i1][i2] = (elt *) malloc(sizeof(elt));
      if (array[i1][i2] == NULL) {
        fprintf(stderr,
                "Sorry, not enough space to malloc array elts in pl_matchvecs\n BYE!");
        exit(1);
      }
    }
  }


  /*
   * compute match
   */
  /* initialize */
  aelt = array[0][0];
  aelt->cost = sq_distance(Vec1[0], Vec2[0]);
  aelt->xptr = 0;
  aelt->yptr = 0;
  /* bottom row */
  i2 = 0;
  for (i1 = 1; i1 < lenVec1; i1++) {
    dist = sq_distance(Vec1[i1], Vec2[i2]);
    aelt = array[i1][i2];
    aelt->cost = array[i1 - 1][i2]->cost + horweight * dist;
    aelt->xptr = -1;
    aelt->yptr = 0;
  }
  /* left column */
  i1 = 0;
  for (i2 = 1; i2 < lenVec2; i2++) {
    dist = sq_distance(Vec1[i1], Vec2[i2]);
    aelt = array[i1][i2];
    aelt->cost = array[i1][i2 - 1]->cost + verweight * dist;
    aelt->xptr = 0;
    aelt->yptr = -1;
  }
  /* middle */
  for (i1 = 1; i1 < lenVec1; i1++) {
    for (i2 = 1; i2 < lenVec2; i2++) {
      dist = sq_distance(Vec1[i1], Vec2[i2]);
      hor = array[i1 - 1][i2]->cost + horweight * dist;
      xdir = -1;
      ydir = 0;
      mindist = hor;
      vert = array[i1][i2 - 1]->cost + verweight * dist;
      if (vert < mindist) {
        xdir = 0;
        ydir = -1;
        mindist = vert;
      }
      diag = array[i1 - 1][i2 - 1]->cost + diagweight * dist;
      if (diag < mindist) {
        xdir = -1;
        ydir = -1;
        mindist = diag;
      }
      aelt = array[i1][i2];
      aelt->cost = mindist;
      aelt->xptr = xdir;
      aelt->yptr = ydir;
    }
  }


  bestscore = best_score(array, lenVec1, lenVec2);
#ifdef foo
  if (debug) {
    print_best_path(array, lenVec1, lenVec2);
/*  print_array_costs(array, lenVec1, lenVec2);
    print_array_dirs(array, lenVec1, lenVec2);
*/
    printf("bestscore: %f\n", bestscore);
  }
#endif

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      free(array[i1][i2]);
    }
  }

  return (bestscore);
}

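The recurrence used by matchvecs (each cell takes the cheapest of its horizontal, vertical and diagonal predecessors, each weighted separately) can be sketched in a compact, self-contained form. The version below is an illustration only: it keeps costs in a single heap block instead of an array of elt pointers, drops the back pointers, and passes the weights as parameters hw, vw and dw in place of the globals horweight, verweight and diagweight. The names dtw_cost and sq_dist are hypothetical.

```c
#include <stdlib.h>

/* Squared difference plus a small epsilon, following sq_distance in the listing. */
static float sq_dist(float a, float b)
{
    float d = a - b;
    return d * d + 0.001f;
}

/* Hypothetical sketch: weighted dynamic-programming alignment cost of two
 * sequences of length n1 >= 1 and n2 >= 1.
 * Returns -1.0f if memory cannot be allocated. */
float dtw_cost(const float *v1, int n1, const float *v2, int n2,
               float hw, float vw, float dw)
{
    float *cost = malloc((size_t)n1 * (size_t)n2 * sizeof *cost);
    float best, d, c;
    int i, j;

    if (cost == NULL)
        return -1.0f;
#define COST(i, j) cost[(size_t)(i) * (size_t)n2 + (size_t)(j)]
    COST(0, 0) = sq_dist(v1[0], v2[0]);
    for (i = 1; i < n1; i++)                      /* bottom row */
        COST(i, 0) = COST(i - 1, 0) + hw * sq_dist(v1[i], v2[0]);
    for (j = 1; j < n2; j++)                      /* left column */
        COST(0, j) = COST(0, j - 1) + vw * sq_dist(v1[0], v2[j]);
    for (i = 1; i < n1; i++) {                    /* middle */
        for (j = 1; j < n2; j++) {
            d = sq_dist(v1[i], v2[j]);
            best = COST(i - 1, j) + hw * d;
            c = COST(i, j - 1) + vw * d;
            if (c < best) best = c;
            c = COST(i - 1, j - 1) + dw * d;
            if (c < best) best = c;
            COST(i, j) = best;
        }
    }
    best = COST(n1 - 1, n2 - 1);                  /* value at the end of both vectors */
#undef COST
    free(cost);
    return best;
}
```

For two identical sequences with unit weights the diagonal path is cheapest, and the result is just the epsilon term accumulated once per element.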

float
sq_distance(float x1, float x2)
{
  float dist;
  float epsilon = .001;

  /*
   * quantization makes many values identical, use of epsilon encourages shortest path
   */

  dist = x1 - x2;
  dist *= dist;
  dist += epsilon;
  return(dist);
}

/*
float parallel_distance(OutlinePair one, OutlinePair two, int ptr1, int ptr2)
{
  float topdist, bottomdist;

  topdist = one->top[ptr1] - two->top[ptr2];
  topdist *= topdist;

  bottomdist = one->bottom[ptr1] - two->bottom[ptr2];
  bottomdist *= bottomdist;

  return(topdist + bottomdist);
}
*/

float
best_score(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2)
{
  /* assume all of Vec1 and Vec2 are used, so just take value at end */

  return(array[lenVec1 - 1][lenVec2 - 1]->cost);
}


/*
 * **********************
 * debugging functions
 */
#ifdef foo
void
print_best_path(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2)
{
  char path[MAXNAME];
  int x, y;
  elt *aelt;
  FILE *ofp;

  x = lenVec1 - 1;
  y = lenVec2 - 1;

  sprintf(path, "/net/piglet/piglet/speech/fchen/pics/paths/p%d.txt", FileCountY);

  ofp = fopen(path, "a");
  if (ofp == NULL)
    printf("Cannot open output file %s.\n", path);

/*  fprintf(ofp, " %3s %3s %6s\n", "x", "y", "cost");
*/
  while (x > 0 || y > 0) {
    aelt = array[x][y];
    fprintf(ofp, " %3d %3d %6.2f\n", x, y, aelt->cost);
    x += aelt->xptr;
    y += aelt->yptr;
  }
/*  fprintf(ofp, "\" match %d\n\n", matchcntr++);
*/
  fprintf(ofp, "\" match %d %d\n\n", FileCountX, FileCountY);
  fclose(ofp);
}
#endif

static float sqr(float x)
{
  return x*x;
}

void print_best_path(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2,
                     char *outFileName)
{
  int x, y;
  elt *aelt;
  FILE *outFile;
  float dist = 0;

  x = lenVec1 - 1;
  y = lenVec2 - 1;

  if ((outFile = (FILE *)fopen(outFileName,"w")) == NULL)
    DoError("single: Cannot open output file %s.\n", outFileName);

  while (x > 0 || y > 0) {
    aelt = array[x][y];
    fprintf(outFile, " %3d %3d %6.2f\n", x, y, aelt->cost);
    dist += sqrt(sqr(aelt->xptr)+sqr(aelt->yptr));
    x += aelt->xptr;
    y += aelt->yptr;
  }
  fclose(outFile);
  printf("distance = %f\n",dist);
}


void
print_array_costs(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2)
{

  int x, y;

  for (y = 0; y < lenVec2; y++) {
    for (x = 0; x < lenVec1; x++) {
      printf("%7.2f ", array[x][y]->cost);
    }
    printf("\n");
  }
}

void
print_array_dirs(elt *array[][MAXSEQLENGTH], int lenVec1, int lenVec2)
{

  int x, y;

  for (y = 0; y < lenVec2; y++) {
    for (x = 0; x < lenVec1; x++) {
      printf("%2d:%2d ", array[x][y]->xptr, array[x][y]->yptr);
    }
    printf("\n");
  }
}

Jul 7 14:28 1991 matchparallel.c

/*
 * matchparallel.c
 * align 2 sequences
 *
 * dependent on match.c
 */

/*
 * TO DO: 1) don't compute over parts of array outside of constraints
 *
 */

#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"
#include "diff2.h"
#include "match.h"
#include "matchparallel.h"

#ifndef MAXLINE
#define MAXLINE 256
#endif

#ifndef MAXNAME
#define MAXNAME 128
#endif

#ifndef TRUE
#define TRUE 1
#endif

#ifndef FALSE
#define FALSE 0
#endif

#ifndef max
#define max(a,b) ((a) > (b) ? (a) : (b))
#endif

#ifndef min
#define min(a,b) ((a) < (b) ? (a) : (b))
#endif


/*
 *********************
 * parallel match with full search
 *********************
 */

float pl_DPDiffPair(OutlinePair one, OutlinePair two, char *pathFile)
/*
 * question, should top and bottom distance be forced to be computed together?
 * use another distance score to check how far off the two are?
 */

{
  float score;

  score = pl_matchvecs(one->top, one->bottom, one->numberOfLegs,
                       two->top, two->bottom, two->numberOfLegs,
                       pathFile);
  return (score);
}

float pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1,
                   float *Vec2t, float *Vec2b, int lenVec2,
                   char *pathFile)
/*
 * Computes the best path between one and two.
 * Allows 2/1 expansion/compression
 */
{
  float dist, mindist, hor, vert, diag;
  float bestscore;
  int i1, i2;
  int xdir, ydir;

  elt *array[MAXSEQLENGTH][MAXSEQLENGTH];
  elt *aelt;

  /* initialize array */

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      array[i1][i2] = (elt *) malloc(sizeof(elt));
      if (array[i1][i2] == NULL) {
        fprintf(stderr,
                "Sorry, not enough space to malloc array elts in pl_matchvecs\n BYE!");
        exit(1);
      }
    }
  }

  /*
   * compute match
   */
  /* initialize */
  aelt = array[0][0];
  aelt->cost = sq_distance(Vec1t[0], Vec2t[0]) + sq_distance(Vec1b[0], Vec2b[0]);
  aelt->xptr = 0;
  aelt->yptr = 0;

  /* bottom row */
  i2 = 0;
  for (i1 = 1; i1 < lenVec1; i1++) {
    dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
    aelt = array[i1][i2];
    aelt->cost = array[i1 - 1][i2]->cost + horweight * dist;
    aelt->xptr = -1;
    aelt->yptr = 0;
  }
  /* left column */
  i1 = 0;
  for (i2 = 1; i2 < lenVec2; i2++) {
    dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
    aelt = array[i1][i2];
    aelt->cost = array[i1][i2 - 1]->cost + verweight * dist;
    aelt->xptr = 0;
    aelt->yptr = -1;
  }
  /* middle */
  for (i1 = 1; i1 < lenVec1; i1++) {
    for (i2 = 1; i2 < lenVec2; i2++) {
      dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
      hor = array[i1 - 1][i2]->cost + horweight * dist;
      xdir = -1;
      ydir = 0;
      mindist = hor;
      vert = array[i1][i2 - 1]->cost + verweight * dist;
      if (vert < mindist) {
        xdir = 0;
        ydir = -1;
        mindist = vert;
      }
      diag = array[i1 - 1][i2 - 1]->cost + diagweight * dist;
      if (diag < mindist) {
        xdir = -1;
        ydir = -1;
        mindist = diag;
      }
      aelt = array[i1][i2];
      aelt->cost = mindist;
      aelt->xptr = xdir;
      aelt->yptr = ydir;
    }
  }

  bestscore = best_score(array, lenVec1, lenVec2);
  if (pathFile)
    print_best_path(array, lenVec1, lenVec2, pathFile);

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      free(array[i1][i2]);
    }
  }

  return(bestscore);
}


/*
 *********************
 * faster parallel match
 * not optimal because warp is limited to swath of width "bw"
 *********************
 */

float faster_pl_DPDiffPair(OutlinePair one, OutlinePair two, char *pathFile)
/*
 * question, should top and bottom distance be forced to be computed together?
 * use another distance score to check how far off the two are?
 */

{
  float score;

  score = faster_pl_matchvecs(one->top, one->bottom, one->numberOfLegs,
                              two->top, two->bottom, two->numberOfLegs,
                              pathFile);
  return (score);
}


float faster_pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1,
                          float *Vec2t, float *Vec2b, int lenVec2,
                          char *pathFile)
/*
 * Computes the best path between one and two.
 * Allows 2/1 expansion/compression only within a band
 */
{
  float dist, mindist, hor, vert, diag;
  float bestscore;
  float ratio;
  int i1, i2;
  int xdir, ydir;
  int beg, end, center;
  int b;      /* pointer to border */
  int border; /* width of border on right side of swath */

  elt *array[MAXSEQLENGTH][MAXSEQLENGTH];
  elt *aelt;

  float infinity = 1.0e30;
  int bw = 20;

  ratio = (float)lenVec1 / (float)lenVec2;
  border = (int) (ratio + .999999);
/*  if (debug)
    printf("ratio: %f\n", ratio);
*/  /* initialize array */

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      array[i1][i2] = (elt *) malloc(sizeof(elt));
      if (array[i1][i2] == NULL) {
        fprintf(stderr,
                "Sorry, not enough space to malloc array elts in pl_matchvecs\n BYE!");
        exit(1);
      }
    }
  }


  /*
   * compute match
   */
  /* initialize */
  aelt = array[0][0];
  aelt->cost = sq_distance(Vec1t[0], Vec2t[0]) + sq_distance(Vec1b[0], Vec2b[0]);
  aelt->xptr = 0;
  aelt->yptr = 0;
  /* bottom row */
  i2 = 0;
  end = bw + border + 1;
  for (i1 = 1; i1 < end; i1++) {
    dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
    aelt = array[i1][i2];
    aelt->cost = array[i1 - 1][i2]->cost + horweight * dist;
    aelt->xptr = -1;
    aelt->yptr = 0;
  }
  /*
   * swath
   *
   * set the elt before beg and at end to infinity, then compute the distances normally
   * for the row
   */
  for (i2 = 1; i2 < lenVec2; i2++) {
    center = i2 * ratio;
    beg = max(1, center - bw);
    end = min(lenVec1, center + bw + 1);
/*  if (debug)
      printf("center: %d, beg: %d, end: %d\n", center, beg, end);
*/  /* beg */
    aelt = array[beg - 1][i2];
    aelt->xptr = 0;
    aelt->yptr = -1;
    if (beg == 1) {
      dist = sq_distance(Vec1t[0], Vec2t[i2]) + sq_distance(Vec1b[0], Vec2b[i2]);
      aelt->cost = array[0][i2 - 1]->cost + verweight * dist;
    }
    else {
      aelt->cost = infinity;
    }
    /* end */
/*  if (end < lenVec1) {
*/
    for (b = end; b < min(end + border, lenVec1); b++) {
/*    if (debug)
        printf("b: %d ", b);
*/    aelt = array[b][i2];
      aelt->cost = infinity;
      aelt->xptr = -1;
      aelt->yptr = 0;
    }
    for (i1 = beg; i1 < end; i1++) {
      dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
      hor = array[i1 - 1][i2]->cost + horweight * dist;
      xdir = -1;
      ydir = 0;
      mindist = hor;
      vert = array[i1][i2 - 1]->cost + verweight * dist;
      if (vert < mindist) {
        xdir = 0;
        ydir = -1;
        mindist = vert;
      }
      diag = array[i1 - 1][i2 - 1]->cost + diagweight * dist;
      if (diag < mindist) {
        xdir = -1;
        ydir = -1;
        mindist = diag;
      }
      aelt = array[i1][i2];
      aelt->cost = mindist;
      aelt->xptr = xdir;
      aelt->yptr = ydir;
    }
  }


  bestscore = best_score(array, lenVec1, lenVec2);
  if (pathFile)
    print_best_path(array, lenVec1, lenVec2, pathFile);

  for (i1 = 0; i1 < lenVec1; i1++) {
    for (i2 = 0; i2 < lenVec2; i2++) {
      free(array[i1][i2]);
    }
  }

  return(bestscore);
}

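The band ("swath") that faster_pl_matchvecs searches in each row is centered on the diagonal scaled by the length ratio of the two vectors. A minimal sketch of that limit calculation is shown here as a standalone function, with the band half-width passed in as bw rather than fixed at 20. The name swath_limits and its output parameters are hypothetical.

```c
/* Hypothetical helper: computes the half-open column range [beg, end) searched
 * in row i2, following the swath logic of faster_pl_matchvecs. */
void swath_limits(int i2, int len1, int len2, int bw, int *beg, int *end)
{
    float ratio = (float)len1 / (float)len2;
    int center = (int)(i2 * ratio);   /* column on the scaled diagonal */
    int lo = center - bw;
    int hi = center + bw + 1;

    *beg = lo > 1 ? lo : 1;           /* column 0 is handled separately */
    *end = hi < len1 ? hi : len1;     /* never run past the first vector */
}
```

For example, with len1 = 100, len2 = 50 and bw = 20, row 25 is centered on column 50 and covers columns 30 through 70.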

/*
 * ********************
 * fastest parallel match
 * warp limited to swath bw, plus no backtracking
 * ********************
 */

float simple_pl_DPDiffPair(OutlinePair one, OutlinePair two)
/*
 * question, should top and bottom distance be forced to be computed together?
 * use another distance score to check how far off the two are?
 */

{
  float score;

  score = simple_pl_matchvecs(one->top, one->bottom, one->numberOfLegs,
                              two->top, two->bottom, two->numberOfLegs);
  return (score);
}

void PrintArrayRow(float *array,int width)
{
  int i;
  for (i = 0; i < width; ++i)
    printf("%2.2f ",*array++);
  printf("\n");
}

void PrintArray(float *array,int height,int width,int signalWidth)
{
  int i;
  for (i = 0; i < height; ++i) {
    printf("%d: ",i);
    PrintArrayRow(array+i*width,signalWidth);
  }
}


float simple_pl_matchvecs(float *Vec1t, float *Vec1b, int lenVec1, float *Vec2t, float
                          *Vec2b, int lenVec2)
/*
 * Computes the best path between one and two within a band.
 * Allows 2/1 expansion/compression only within a band.
 */
{
  float dist, mindist, hor, vert, diag;
  float bestscore;
  float ratio;
  int i1, i2;
  int xdir, ydir;
  int beg, end, center;
  int b;      /* pointer to border */
  int border; /* width of border on right side of swath */

  float array[MAXSEQLENGTH][MAXSEQLENGTH];

  float infinity = 1.0e30;
  int bw = 20;

  ratio = (float)lenVec1 / (float)lenVec2;
  border = (int) (ratio + .999999);
/*  if (debug)
    printf("ratio: %f\n", ratio);
*/  /* initialize array */

  /*
   * compute match
   */
  /* initialize */
  array[0][0] = sq_distance(Vec1t[0], Vec2t[0]) + sq_distance(Vec1b[0], Vec2b[0]);


  /* bottom row */
  i2 = 0;
  end = bw + border + 1;
  for (i1 = 1; i1 < end; i1++) {
    dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
    array[i1][i2] = array[i1 - 1][i2] + horweight * dist;
  }
  /*
   * swath
   *
   * set the elt before beg and at end to infinity, then compute the distances normally
   * for the row
   */
  for (i2 = 1; i2 < lenVec2; i2++) {
    center = i2 * ratio;
    beg = max(1, center - bw);
    end = min(lenVec1, center + bw + 1);
/*  if (debug)
      printf("center: %d, beg: %d, end: %d\n", center, beg, end);
*/  /* beg */
    if (beg == 1) {
      dist = sq_distance(Vec1t[0], Vec2t[i2]) + sq_distance(Vec1b[0], Vec2b[i2]);
      array[beg - 1][i2] = array[0][i2 - 1] + verweight * dist;
    }
    else {
      array[beg - 1][i2] = infinity;
    }
    /* end */
    for (b = end; b < min(end + border, lenVec1); b++) {
/*    if (debug)
        printf("b: %d ", b);
*/
      array[b][i2] = infinity;
    }
    for (i1 = beg; i1 < end; i1++) {
      dist = sq_distance(Vec1t[i1], Vec2t[i2]) + sq_distance(Vec1b[i1], Vec2b[i2]);
      hor = array[i1 - 1][i2] + horweight * dist;
      mindist = hor;
      vert = array[i1][i2 - 1] + verweight * dist;
      if (vert < mindist) {
        mindist = vert;
      }
      diag = array[i1 - 1][i2 - 1] + diagweight * dist;
      if (diag < mindist) {
        mindist = diag;
      }
      array[i1][i2] = mindist;
    }
  }

  bestscore = array[lenVec1 - 1][lenVec2 - 1];
  if (debug) {
    printf("best score: %f\n", bestscore);
  }

  return(bestscore);
}

Jul 24 17:16 1991 newL2.c

#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "dict.h"

#define NORMAL_LENGTH (100.0)
#define MAX_SLOPE (2.0)
#define BIG_NUM (10.0e20)


void ResampleOutlinePair(OutlinePair a,float newToOldFactor)
/* Resample an outline pair using linear interpolation. */
{
  int newWidth,oldWidth,i;
  int oldLeft,oldRight;
  float oldCenter;
  float *newX,*newTop,*newBottom;
  float *xCursor,*topCursor,*bottomCursor;

  oldWidth = a->numberOfLegs;
  newWidth = irint(newToOldFactor*oldWidth);

  newX = (float *)calloc(newWidth,sizeof(float));
  newTop = (float *)calloc(newWidth,sizeof(float));
  newBottom = (float *)calloc(newWidth,sizeof(float));
  if ((newX == NULL)||(newTop == NULL)||(newBottom == NULL))
    DoError("ResampleOutlinePair: cannot allocate space.\n",NULL);

  xCursor = newX;
  topCursor = newTop;
  bottomCursor = newBottom;

  for (i=0;i<newWidth;++i) {
    oldCenter = i/(float)newWidth*(float)oldWidth;
    oldLeft = irint(floor(oldCenter));
    oldRight = irint(ceil(oldCenter));
    if (oldLeft == oldRight) {
      *xCursor++ = *(a->x+oldLeft);
      *topCursor++ = *(a->top+oldLeft);
      *bottomCursor++ = *(a->bottom+oldLeft);
    }
    else {
      float slope;
      slope = *(a->x+oldRight)-*(a->x+oldLeft);
      *xCursor++ = *(a->x+oldLeft) + (oldCenter-oldLeft)*slope;
      slope = *(a->top+oldRight)-*(a->top+oldLeft);
      *topCursor++ = *(a->top+oldLeft) + (oldCenter-oldLeft)*slope;
      slope = *(a->bottom+oldRight)-*(a->bottom+oldLeft);
      *bottomCursor++ = *(a->bottom+oldLeft) + (oldCenter-oldLeft)*slope;
    }
  }



10/24/2003, EAST version: 1.4.1 



5,491,760 

237 238 

Section C APPENDIX / Page 100 

53 

54 free(a->x); 

55 free(a->top); 

56 free(a->bottom); 
57 

58 a->x = newX; 

59 a->top = newTop; 

60 a- > bottom = newBottom; 

61 a- > n umberOf Legs = newWidth; 

62 } 
63 

float L2Norm(OutlinePair signal,int startOffset,
	     OutlinePair model,float topToBottom)
{
  float *top1,*top2,*bottom1,*bottom2;
  int i,overlap;
  float sum;
  float temp;

  if ((startOffset<0) ||
      (startOffset+model->numberOfLegs > signal->numberOfLegs))
    DoError("L2Norm: the model must overlap the signal.\n",NULL);

  top1 = signal->top+startOffset;
  top2 = model->top;
  bottom1 = signal->bottom+startOffset;
  bottom2 = model->bottom;

  overlap = signal->numberOfLegs-startOffset;
  if (overlap > model->numberOfLegs)
    overlap = model->numberOfLegs;

  /* Sum of squared differences; top contour weighted by topToBottom */
  for (i=0,sum=0;i<overlap;++i) {
    temp = *top1++ - *top2++;
    sum += temp * temp * topToBottom;
    temp = *bottom1++ - *bottom2++;
    sum += temp * temp;
  }

  return sum;
}

float L2Compare(OutlinePair o1,OutlinePair o2,float topToBottom)
{
  float slope = (float)o1->width/(float)o2->width;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE))
    return BIG_NUM;
  if (o1->numberOfLegs != NORMAL_LENGTH)
    ResampleOutlinePair(o1,NORMAL_LENGTH/o1->numberOfLegs);
  if (o2->numberOfLegs != NORMAL_LENGTH)
    ResampleOutlinePair(o2,NORMAL_LENGTH/o2->numberOfLegs);
  return L2Norm(o1,0,o2,topToBottom);
}
Aug 14 20:54 1991 newMatch.c

#include <stdio.h>
#include "mylib.h"
#include "misc.h"
#include "types.h"
#include "dict.h"
#include "newMatch.h"

#define MAX_SIGNAL_LENGTH (800)
#define MAX_SLOPE (2.0)
#define BIG_NUM (10e20)

typedef enum {NONE,LEFT,DOWN,DOWNLEFT,D1L1,D2L1,D1L2} Direction;

extern double sqrt(double);
extern double cos(double);
extern double atan(double);
extern int irint(double);

/* Assumes that a represents the model and b represents the unknown.
 * Weights places where the model is lower than the unknown more than
 * cases where the model is higher than the unknown. The idea here is
 * that valleys can be filled in by bleeding together, but that noise
 * can rarely make a contour be too tall for extended periods.
 */
float hillToValley = 1.0;
inline float SquareDifference(float a,float b)
{
  float temp = a-b;
  if (temp<0)
    return temp*temp;
  else
    return temp*temp*hillToValley*hillToValley;
  /* return (a-b)*(a-b); */
}

inline float FMax(float a,float b)
{
  if (a>b)
    return a;
  else
    return b;
}

inline float FMin(float a,float b)
{
  if (a<b)
    return a;
  else
    return b;
}

inline int IMax(int a,int b)
{
  if (a>b)
    return a;
  else
    return b;
}

inline int IMin(int a,int b)
{
  if (a<b)
    return a;
  else
    return b;
}
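The asymmetric cost in `SquareDifference` can be tried out on its own. The sketch below is illustrative only: `asym_sq_diff` is a hypothetical name, and the weight is passed as a parameter instead of being read from the global `hillToValley`. It weights the squared difference by `hillToValley` squared only when the first argument is greater than or equal to the second, exactly as the branch in the listing does.

```c
/* Hypothetical stand-alone version of the asymmetric squared difference.
 * When a < b the plain squared difference is returned; otherwise the
 * squared difference is scaled by hillToValley squared. */
float asym_sq_diff(float a, float b, float hillToValley)
{
    float temp = a - b;

    if (temp < 0)
        return temp * temp;
    return temp * temp * hillToValley * hillToValley;
}
```

With the listing's default weight of 1.0 the function reduces to an ordinary squared difference, so the asymmetry only takes effect when the weight is changed.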

float NewMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
	       float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
	       float topToBottom)
{
  float costs0[MAX_SIGNAL_LENGTH+1];
  float costs1[MAX_SIGNAL_LENGTH+1];
  int i,j,start,end,bandWidth,shift;
  int realStart,realEnd,center,oldEnd;
  float slope,angle;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float oldCost,b1v,b2v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("NewMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;

  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }
  angle = atan(slope);
  bandWidth = irint(normalBandWidth/cos(angle));
  center = 0;
  realStart = center-bandWidth/2;
  realEnd = realStart+bandWidth;
  end = FMin(realEnd,aLength);

  a1c = a1; /* a1 cursor */
  a2c = a2; /* a2 cursor */
  b1v = *b1; /* b1 value */
  b2v = *b2; /* b2 value */
  dc = costs0;
  *dc++ = BIG_NUM;
  oldCost = *dc++ =
    SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);
  for (j=1;j<end;++j)
    oldCost = *dc++ =
      oldCost+SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);

  for (i=1;i<bLength;++i) {
    /* Compute new center of band */
    center = irint(slope*i);
    realStart = center-bandWidth/2;
    realEnd = realStart+bandWidth;
    start = FMax(realStart,0);
    oldEnd = end;
    end = FMin(realEnd,aLength);
    shift = end-oldEnd;

    /* put large numbers where bands don't overlap */
    for (j=0;j<shift;++j)
      *dc++ = BIG_NUM;

    a1c = a1+start; /* a1 cursor */
    a2c = a2+start; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */
    if (i&1) {
      cd = costs1+start-1+1; /* cursor down ??? What about -1??? */
      cdl = costs0+start-1+1; /* cursor down left */
      cl = costs0+start+1; /* cursor left */
      dc = costs1+start+1; /* destination cursor */
    }
    else {
      cd = costs0+start-1+1; /* cursor down */
      cdl = costs1+start-1+1; /* cursor down left */
      cl = costs1+start+1; /* cursor left */
      dc = costs0+start+1; /* destination cursor */
    }
    *cd = BIG_NUM;
    for (j=start;j<end;++j) {
      float down,left,downLeft,rest;
      /* local cost of matching a[j] with b[i] */
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
	SquareDifference(*a2c++,b2v);
      down = *cd++ + rest;
      left = *cl++ + rest;
      downLeft = *cdl++ + rest*centerWeight;
      *dc++ = FMin(FMin(down,left),downLeft);
    }
  }

  i--;
  if (i&1)
    dc = costs1+aLength-1+1;
  else
    dc = costs0+aLength-1+1;
  returnVal = *dc;

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}
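The core recurrence of `NewMatch` is a dynamic time warp that keeps only two cost rows. The sketch below is a simplified illustration under stated assumptions, not the listing's routine: `dtw_two_rows`, `SKETCH_MAX_LEN` and the helper names are hypothetical, it compares one contour instead of a top/bottom pair, it uses a plain squared difference, and it omits the band restriction, the slope check and length normalization.

```c
#define SKETCH_MAX_LEN 800 /* assumption: mirrors MAX_SIGNAL_LENGTH */

static float sk_sq(float a, float b)
{
    float t = a - b;
    return t * t;
}

static float sk_min3(float a, float b, float c)
{
    float m = (a < b) ? a : b;
    return (m < c) ? m : c;
}

/* Accumulated cost of the best monotone alignment of a[0..aLen-1] with
 * b[0..bLen-1]. Returns -1 on invalid lengths. Only two rows of the cost
 * table are kept; they swap roles after each value of i. */
float dtw_two_rows(const float *a, int aLen, const float *b, int bLen,
                   float centerWeight)
{
    float row0[SKETCH_MAX_LEN], row1[SKETCH_MAX_LEN];
    float *prev = row0, *cur = row1, *tmp;
    int i, j;

    if (aLen <= 0 || bLen <= 0 || aLen > SKETCH_MAX_LEN)
        return -1.0f;

    /* First row: only moves along a are possible. */
    prev[0] = sk_sq(a[0], b[0]);
    for (j = 1; j < aLen; ++j)
        prev[j] = prev[j - 1] + sk_sq(a[j], b[0]);

    for (i = 1; i < bLen; ++i) {
        cur[0] = prev[0] + sk_sq(a[0], b[i]);
        for (j = 1; j < aLen; ++j) {
            float rest = sk_sq(a[j], b[i]);
            float down = cur[j - 1] + rest;      /* advance a only */
            float left = prev[j] + rest;         /* advance b only */
            float downLeft = prev[j - 1] + rest * centerWeight;
            cur[j] = sk_min3(down, left, downLeft);
        }
        tmp = prev; prev = cur; cur = tmp;
    }
    return prev[aLen - 1];
}
```

A `centerWeight` below 1 favors the diagonal move, which keeps the warp close to a straight line between the two signals.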

float SepMatch(float *a1,int aLength,float *b1,int bLength,
	       float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth)
{
  float costs0[MAX_SIGNAL_LENGTH+1];
  float costs1[MAX_SIGNAL_LENGTH+1];
  int i,j,start,end,bandWidth,shift;
  int realStart,realEnd,center,oldEnd;
  float slope,angle;
  float *a1c,*cd,*cl,*cdl,*dc;
  float oldCost,b1v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("NewMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;

  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }
  angle = atan(slope);
  bandWidth = irint(normalBandWidth/cos(angle));
  center = 0;
  realStart = center-bandWidth/2;
  realEnd = realStart+bandWidth;
  end = FMin(realEnd,aLength);

  a1c = a1; /* a1 cursor */
  b1v = *b1; /* b1 value */
  dc = costs0;
  *dc++ = BIG_NUM;
  oldCost = *dc++ = SquareDifference(*a1c++,b1v);

  for (j=1;j<end;++j)
    oldCost = *dc++ = oldCost+SquareDifference(*a1c++,b1v);

  for (i=1;i<bLength;++i) {
    /* Compute new center of band */
    center = irint(slope*i);
    realStart = center-bandWidth/2;
    realEnd = realStart+bandWidth;
    start = FMax(realStart,0);
    oldEnd = end;
    end = FMin(realEnd,aLength);
    shift = end-oldEnd;

    /* put large numbers where bands don't overlap */
    for (j=0;j<shift;++j)
      *dc++ = BIG_NUM;

    a1c = a1+start; /* a1 cursor */
    b1v = *(b1+i); /* b1 value */
    if (i&1) {
      cd = costs1+start-1+1; /* cursor down ??? What about -1??? */
      cdl = costs0+start-1+1; /* cursor down left */
      cl = costs0+start+1; /* cursor left */
      dc = costs1+start+1; /* destination cursor */
    }
    else {
      cd = costs0+start-1+1; /* cursor down */
      cdl = costs1+start-1+1; /* cursor down left */
      cl = costs1+start+1; /* cursor left */
      dc = costs0+start+1; /* destination cursor */
    }
    *cd = BIG_NUM;
    for (j=start;j<end;++j) {
      float down,left,downLeft,rest;
      /* local cost of matching a[j] with b[i] */
      rest = SquareDifference(*a1c++,b1v);
      down = *cd++ + rest;
      left = *cl++ + rest;
      downLeft = *cdl++ + rest*centerWeight;
      *dc++ = FMin(FMin(down,left),downLeft);
    }
  }

  i--;
  if (i&1)
    dc = costs1+aLength-1+1;
  else
    dc = costs0+aLength-1+1;
  returnVal = *dc;

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}

#define WIDTH (800)
#define H_MARGIN (20)
#define V_MARGIN (40)
#define H_SPACING (20)
#define V_SPACING (100) /* Must be greater than 2*X_HEIGHT */
#define X_HEIGHT (17)
void DrawVLine(Picture pict,int x,int yt,int yb)
{
  int i;
  for (i=yt;i<yb;++i)
    WritePixel(pict,x,i,1);
}

void DrawOutline(Picture pict,int numberOfLegs,float *tops,float *bottoms,int x,int y)
{
  int i,top,bottom;
  for (i=0;i<numberOfLegs;++i) {
    top = irint(-*(tops+i)*X_HEIGHT);
    bottom = irint(*(bottoms+i)*X_HEIGHT+X_HEIGHT);
    DrawVLine(pict,i+x,top+y,bottom+y);
  }
}

void PrintPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
	       Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],int i,
	       float returnVal,
	       FILE *pathFP)
{
  int x,y,j;
  int length = 0;
  int index = 0;
  float newTop1[MAX_SIGNAL_LENGTH],newBottom1[MAX_SIGNAL_LENGTH];
  float newTop2[MAX_SIGNAL_LENGTH],newBottom2[MAX_SIGNAL_LENGTH];

  /* First pass: walk the path backwards to count its length */
  y = i;
  x = aLength-1;
  while (path[y][x] != NONE) {
    switch (path[y][x]) {
    case DOWN:
      x--;
      break;
    case LEFT:
      y--;
      break;
    case D1L1:
    case DOWNLEFT:
      x--;
      y--;
      break;
    case D2L1:
      x -= 2;
      y--;
      break;
    case D1L2:
      x--;
      y -= 2;
      break;
    default:
      DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
    }
    ++length;
  }

  /* Second pass: record the warped signals along the path */
  y = i;
  x = aLength-1;
  while (path[y][x] != NONE) {
    if (index >= MAX_SIGNAL_LENGTH)
      DoError("NewMatchAddPath: warped signal is too long.\n",NULL);
    newTop1[length-index] = a1[x];
    newBottom1[length-index] = a2[x];
    newTop2[length-index] = b1[y];
    newBottom2[length-index] = b2[y];
    switch (path[y][x]) {
    case DOWN:
      x--;
      break;
    case LEFT:
      y--;
      break;
    case D1L1:
    case DOWNLEFT:
      x--;
      y--;
      break;
    case D2L1:
      x -= 2;
      y--;
      break;
    case D1L2:
      x--;
      y -= 2;
      break;
    default:
      DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
    }
    ++index;
  }
  if (index >= MAX_SIGNAL_LENGTH)
    DoError("NewMatchAddPath: warped signal is too long.\n",NULL);
  newTop1[length-index] = a1[x];
  newBottom1[length-index] = a2[x];
  newTop2[length-index] = b1[y];
  newBottom2[length-index] = b2[y];
  ++index;

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,newTop1[j]);
  fprintf(pathFP,"\"top1\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,newTop2[j]);
  fprintf(pathFP,"\"top2\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,-newBottom1[j]);
  fprintf(pathFP,"\"bottom1\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,-newBottom2[j]);
  fprintf(pathFP,"\"bottom2\n\n");

  {
    Picture pict;
    pict = new_pict(IMax(index,IMax(aLength,bLength))*2+H_SPACING+H_MARGIN*2,
		    V_MARGIN*2+2*V_SPACING,1);
    DrawOutline(pict,aLength,a1,a2,H_MARGIN,V_MARGIN);
    DrawOutline(pict,bLength,b1,b2,H_MARGIN+aLength+H_SPACING,V_MARGIN);

    DrawOutline(pict,index,newTop1,newBottom1,H_MARGIN,V_MARGIN+V_SPACING);
    DrawOutline(pict,index,newTop2,newBottom2,H_MARGIN+index+H_SPACING,
		V_MARGIN+V_SPACING);
    DrawOutline(pict,index,newTop2,newBottom2,H_MARGIN,V_MARGIN+V_SPACING*2);
    write_pict("out.pict",pict);
  }

  {
    float checksum = 0;
    fprintf(pathFP,"%d %f\n",0,checksum);
    for (j=0,checksum=0;j<index;++j) {
      checksum += SquareDifference(newTop1[j],newTop2[j])+
	SquareDifference(newBottom1[j],newBottom2[j]);
      fprintf(pathFP,"%d %f\n",j,checksum);
    }
    printf("checksum, score = %6.2f, %6.2f\n",checksum,returnVal);
  }
}


float NewMatchAndPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
		      float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
		      float topToBottom,FILE *pathFP)
{
  Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],*pc;
  int x,y;
  float costs0[MAX_SIGNAL_LENGTH+1];
  float costs1[MAX_SIGNAL_LENGTH+1];
  int i,j,start,end,bandWidth,shift;
  int realStart,realEnd,center,oldEnd;
  float slope,angle;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float oldCost,b1v,b2v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("NewMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;

  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }
  angle = atan(slope);
  bandWidth = irint(normalBandWidth/cos(angle));
  center = 0;
  realStart = center-bandWidth/2;
  realEnd = realStart+bandWidth;
  end = FMin(realEnd,aLength);

  a1c = a1; /* a1 cursor */
  a2c = a2; /* a2 cursor */
  b1v = *b1; /* b1 value */
  b2v = *b2; /* b2 value */
  dc = costs0;
  pc = &(path[0][0]);
  *dc++ = BIG_NUM;
  oldCost = *dc++ =
    SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);
  *pc++ = NONE;

  for (j=1;j<end;++j) {
    oldCost = *dc++ =
      oldCost+SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);
    *pc++ = DOWN;
  }

#ifdef foo
  printf("%6d",0);
  for (j=0;j<end;++j)
    printf("%6.2f ",costs0[j+1]);
#endif

  for (i=1;i<bLength;++i) {
    /* Compute new center of band */
    center = irint(slope*i);
    realStart = center-bandWidth/2;
    realEnd = realStart+bandWidth;
    start = FMax(realStart,0);
    oldEnd = end;
    end = FMin(realEnd,aLength);
    shift = end-oldEnd;

    /* put large numbers where bands don't overlap */
    for (j=0;j<shift;++j) {
      /* printf("%6.2f ",BIG_NUM); */
      *dc++ = BIG_NUM;
    }

    /* printf("\n%6d ",i); */

    a1c = a1+start; /* a1 cursor */
    a2c = a2+start; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */
    pc = &(path[i][start]);
    if (i&1) {
      cd = costs1+start-1+1; /* cursor down ??? What about -1??? */
      cdl = costs0+start-1+1; /* cursor down left */
      cl = costs0+start+1; /* cursor left */
      dc = costs1+start+1; /* destination cursor */
    }
    else {
      cd = costs0+start-1+1; /* cursor down */
      cdl = costs1+start-1+1; /* cursor down left */
      cl = costs1+start+1; /* cursor left */
      dc = costs0+start+1; /* destination cursor */
    }
    *cd = BIG_NUM;
    for (j=start;j<end;++j) {
      float down,left,downLeft,rest;
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
	SquareDifference(*a2c++,b2v);
      down = *cd++ + rest;
      left = *cl++ + rest;
      downLeft = *cdl++ + rest*centerWeight;

      if (down < left)
	if (down < downLeft) {
	  /* printf("%6.2f ",down); */
	  *dc++ = down;
	  *pc++ = DOWN;
	}
	else {
	  /* printf("%6.2f ",downLeft); */
	  *dc++ = downLeft;
	  *pc++ = DOWNLEFT;
	}
      else
	if (downLeft < left) {
	  /* printf("%6.2f ",downLeft); */
	  *dc++ = downLeft;
	  *pc++ = DOWNLEFT;
	}
	else {
	  /* printf("%6.2f ",left); */
	  *dc++ = left;
	  *pc++ = LEFT;
	}
    }
  }

  i--;
  if (i&1)
    dc = costs1+aLength-1+1;
  else
    dc = costs0+aLength-1+1;
  returnVal = *dc;

#ifdef foo
  if (!doPath) {
    y = i;
    x = aLength-1;
    while (path[y][x] != NONE) {
      switch (path[y][x]) {
      case DOWN:
	x--;
	break;
      case LEFT:
	y--;
	break;
      case DOWNLEFT:
	x--;
	y--;
	break;
      default:
	DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
      }
      fprintf(pathFP,"%d %d\n",x,y);
    }
    fprintf(pathFP,"%d %d\n",x,y);
  }
  else
#endif
    PrintPath(a1,a2,aLength,b1,b2,bLength,path,i,returnVal,pathFP);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}


float SlopeCMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
		  float centerWeight,BOOLEAN lengthNormalize,float topToBottom)
{
  float costs0[MAX_SIGNAL_LENGTH+2];
  float costs1[MAX_SIGNAL_LENGTH+2];
  float costs2[MAX_SIGNAL_LENGTH+2];
  float slope,minVal;
  int i,j;
  int bottom,top;
  float *cd1l1,*cd2l1,*cd1l2;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float b1v,b2v,returnVal;

  /* printf("sc:\n"); */

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("SlopeCMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }

  for (i=0;i<aLength+2;++i) {
    costs2[i] = BIG_NUM;
    costs1[i] = BIG_NUM;
    costs0[i] = BIG_NUM;
  }

  costs0[2] = SquareDifference(*a1,*b1)*topToBottom+SquareDifference(*a2,*b2);

  for (i=1;i<bLength;++i) {
    bottom = IMax(i/2,2*i+aLength-2*bLength);
    top = IMin(2*i,i/2+aLength-bLength/2)+1;

    a1c = a1+bottom; /* a1 cursor */
    a2c = a2+bottom; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */

    /* Three cost rows are reused cyclically, selected by i%3 */
    switch (i%3) {
    case 0:
      dc = costs0+bottom-2+2;
      cd2l1 = costs2+bottom-2+2;
      cd1l2 = costs1+bottom-1+2;
      cd1l1 = costs2+bottom-1+2;
      break;
    case 1:
      dc = costs1+bottom-2+2;
      cd2l1 = costs0+bottom-2+2;
      cd1l2 = costs2+bottom-1+2;
      cd1l1 = costs0+bottom-1+2;
      break;
    case 2:
      dc = costs2+bottom-2+2;
      cd2l1 = costs1+bottom-2+2;
      cd1l2 = costs0+bottom-1+2;
      cd1l1 = costs1+bottom-1+2;
      break;
    }
    *dc++ = BIG_NUM;
    *dc++ = BIG_NUM;
    for (j=bottom;j<top;++j) {
      float d2l1,d1l2,d1l1,rest;
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
	SquareDifference(*a2c++,b2v);
      d1l1 = *cd1l1++ + rest*centerWeight;
      d1l2 = *cd1l2++ + rest;
      d2l1 = *cd2l1++ + rest;

      *dc++ = FMin(FMin(d1l1,d2l1),d1l2);
    }

    switch (i%3) {
    case 0:
      dc = costs0;
      break;
    case 1:
      dc = costs1;
      break;
    case 2:
      dc = costs2;
      break;
    }

#ifdef foo
    minVal = BIG_NUM;
    printf("%6d: ",i);
    for (j=0;j<aLength+2;++j) {
      if (*dc<=minVal)
	minVal = *dc;
      if (*dc++ >= BIG_NUM)
	printf(" ");
      else
	printf("*");
    }
    printf(" %6.2f\n",minVal);
#endif
  }

  --i;
  switch (i%3) {
  case 0:
    dc = costs0;
    break;
  case 1:
    dc = costs1;
    break;
  case 2:
    dc = costs2;
    break;
  }
  returnVal = *(dc+aLength-1+2);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}

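The slope constraint used by `SlopeCMatch` can be illustrated with a full cost table on short signals. The sketch below is an assumption-laden simplification rather than the listing's routine: `slope_dtw`, `SC_MAX` and `SC_BIG` are hypothetical names, a single contour with a plain squared difference is used, and the three rolling rows and the `bottom`/`top` bounds are replaced by a complete table. Each cell may be reached only by the moves the listing calls D1L1, D2L1 and D1L2, so no path can be steeper than 2 or shallower than 1/2.

```c
#define SC_MAX 64       /* assumption: small limit for the sketch */
#define SC_BIG 1.0e20f  /* stands in for BIG_NUM */

static float sc_sq(float a, float b)
{
    float t = a - b;
    return t * t;
}

static float sc_min3(float a, float b, float c)
{
    float m = (a < b) ? a : b;
    return (m < c) ? m : c;
}

/* Slope-constrained warp cost of a[0..aLen-1] against b[0..bLen-1].
 * cost[i][j] is the best cost of a path ending at (b index i, a index j).
 * A result of at least SC_BIG means no admissible path exists. */
float slope_dtw(const float *a, int aLen, const float *b, int bLen,
                float centerWeight)
{
    float cost[SC_MAX][SC_MAX];
    int i, j;

    if (aLen <= 0 || bLen <= 0 || aLen > SC_MAX || bLen > SC_MAX)
        return SC_BIG;

    for (i = 0; i < bLen; ++i)
        for (j = 0; j < aLen; ++j)
            cost[i][j] = SC_BIG;
    cost[0][0] = sc_sq(a[0], b[0]);

    for (i = 1; i < bLen; ++i) {
        for (j = 1; j < aLen; ++j) {
            float rest = sc_sq(a[j], b[i]);
            /* D1L1: one step in a, one step in b */
            float d1l1 = cost[i - 1][j - 1] + rest * centerWeight;
            /* D2L1: two steps in a, one step in b */
            float d2l1 = (j >= 2) ? cost[i - 1][j - 2] + rest : SC_BIG;
            /* D1L2: one step in a, two steps in b */
            float d1l2 = (i >= 2) ? cost[i - 2][j - 1] + rest : SC_BIG;
            cost[i][j] = sc_min3(d1l1, d2l1, d1l2);
        }
    }
    return cost[bLen - 1][aLen - 1];
}
```

Because every move advances both signals, the constraint replaces the explicit band of `NewMatch`: cells that cannot lie on a path with slope between 1/2 and 2 simply keep the large sentinel value.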

float SepSlopeCMatch(float *a1,int aLength,float *b1,int bLength,
		     float centerWeight,BOOLEAN lengthNormalize)
{
  float costs0[MAX_SIGNAL_LENGTH+2];
  float costs1[MAX_SIGNAL_LENGTH+2];
  float costs2[MAX_SIGNAL_LENGTH+2];
  float slope,minVal;
  int i,j;
  int bottom,top;
  float *cd1l1,*cd2l1,*cd1l2;
  float *a1c,*cd,*cl,*cdl,*dc;
  float b1v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("SlopeCMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }

  for (i=0;i<aLength+2;++i) {
    costs2[i] = BIG_NUM;
    costs1[i] = BIG_NUM;
    costs0[i] = BIG_NUM;
  }

  costs0[2] = SquareDifference(*a1,*b1);

  for (i=1;i<bLength;++i) {
    bottom = IMax(i/2,2*i+aLength-2*bLength);
    top = IMin(2*i,i/2+aLength-bLength/2)+1;

    a1c = a1+bottom; /* a1 cursor */
    b1v = *(b1+i); /* b1 value */

    switch (i%3) {
    case 0:
      dc = costs0+bottom-2+2;
      cd2l1 = costs2+bottom-2+2;
      cd1l2 = costs1+bottom-1+2;
      cd1l1 = costs2+bottom-1+2;
      break;
    case 1:
      dc = costs1+bottom-2+2;
      cd2l1 = costs0+bottom-2+2;
      cd1l2 = costs2+bottom-1+2;
      cd1l1 = costs0+bottom-1+2;
      break;
    case 2:
      dc = costs2+bottom-2+2;
      cd2l1 = costs1+bottom-2+2;
      cd1l2 = costs0+bottom-1+2;
      cd1l1 = costs1+bottom-1+2;
      break;
    }
    *dc++ = BIG_NUM;
    *dc++ = BIG_NUM;
    for (j=bottom;j<top;++j) {
      float d2l1,d1l2,d1l1,rest;
      rest = SquareDifference(*a1c++,b1v);
      d1l1 = *cd1l1++ + rest*centerWeight;
      d1l2 = *cd1l2++ + rest;
      d2l1 = *cd2l1++ + rest;

      *dc++ = FMin(FMin(d1l1,d2l1),d1l2);
    }

    switch (i%3) {
    case 0:
      dc = costs0;
      break;
    case 1:
      dc = costs1;
      break;
    case 2:
      dc = costs2;
      break;
    }
  }

  --i;
  switch (i%3) {
  case 0:
    dc = costs0;
    break;
  case 1:
    dc = costs1;
    break;
  case 2:
    dc = costs2;
    break;
  }
  returnVal = *(dc+aLength-1+2);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}


float SlopeCMatchAndPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
			 float centerWeight,BOOLEAN lengthNormalize,float topToBottom,
			 FILE *pathFP)
{
  Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],*pc;
  float costs0[MAX_SIGNAL_LENGTH+2];
  float costs1[MAX_SIGNAL_LENGTH+2];
  float costs2[MAX_SIGNAL_LENGTH+2];
  float slope,minVal;
  int i,j;
  int bottom,top;
  float *cd1l1,*cd2l1,*cd1l2;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float b1v,b2v,returnVal;

  /* printf("sc:\n"); */

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("SlopeCMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }

  for (i=0;i<aLength+2;++i) {
    costs2[i] = BIG_NUM;
    costs1[i] = BIG_NUM;
    costs0[i] = BIG_NUM;
  }

  pc = &(path[0][0]);
  *pc++ = NONE;
  costs0[2] = SquareDifference(*a1,*b1)*topToBottom+SquareDifference(*a2,*b2);

  for (i=1;i<bLength;++i) {
    bottom = IMax(i/2,2*i+aLength-2*bLength);
    top = IMin(2*i,i/2+aLength-bLength/2)+1;

    a1c = a1+bottom; /* a1 cursor */
    a2c = a2+bottom; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */

    switch (i%3) {
    case 0:
      dc = costs0+bottom-2+2;
      cd2l1 = costs2+bottom-2+2;
      cd1l2 = costs1+bottom-1+2;
      cd1l1 = costs2+bottom-1+2;
      break;
    case 1:
      dc = costs1+bottom-2+2;
      cd2l1 = costs0+bottom-2+2;
      cd1l2 = costs2+bottom-1+2;
      cd1l1 = costs0+bottom-1+2;
      break;
    case 2:
      dc = costs2+bottom-2+2;
      cd2l1 = costs1+bottom-2+2;
      cd1l2 = costs0+bottom-1+2;
      cd1l1 = costs1+bottom-1+2;
      break;
    }
    *dc++ = BIG_NUM;
    *dc++ = BIG_NUM;
    pc = &(path[i][bottom]);
    for (j=bottom;j<top;++j) {
      float d2l1,d1l2,d1l1,rest;

      rest = SquareDifference(*a1c++,b1v)*topToBottom+
	SquareDifference(*a2c++,b2v);
      d1l1 = *cd1l1++ + rest*centerWeight;
      d1l2 = *cd1l2++ + rest;
      d2l1 = *cd2l1++ + rest;

      /* Keep the cheapest move and remember it for the traceback */
      if (d1l1<d1l2)
	if (d1l1<d2l1) {
	  *dc++ = d1l1;
	  *pc++ = D1L1;
	}
	else {
	  *dc++ = d2l1;
	  *pc++ = D2L1;
	}
      else
	if (d1l2<d2l1) {
	  *dc++ = d1l2;
	  *pc++ = D1L2;
	}
	else {
	  *dc++ = d2l1;
	  *pc++ = D2L1;
	}
    }

    switch (i%3) {
    case 0:
      dc = costs0;
      break;
    case 1:
      dc = costs1;
      break;
    case 2:
      dc = costs2;
      break;
    }
    minVal = BIG_NUM;
    printf("%6d: ",i);
    for (j=0;j<aLength+2;++j) {
      if (*dc <= minVal)
	minVal = *dc;
      if (*dc++ >= BIG_NUM)
	printf(" ");
      else
	printf("*");
    }
    printf(" %6.2f\n",minVal);
  }

  --i;
  switch (i%3) {
  case 0:
    dc = costs0;
    break;
  case 1:
    dc = costs1;
    break;
  case 2:
    dc = costs2;
    break;
  }
  returnVal = *(dc+aLength-1+2);

  PrintPath(a1,a2,aLength,b1,b2,bLength,path,i,returnVal,pathFP);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}
Aug 2 02:29 1991 recogDesc.c

#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"
#include "diff.h"

#define BIG_NUM (10e10)
#define MAX_WORDS (100)
#define MAX_FONTS (10)

extern double sqrt(double);

/* Euclidean distance between two descriptor vectors */
float CompareNumericDescriptors(float *a,float *b,int length)
{
  int i;
  float sum;
  for (i=0,sum=0;i<length;++i) {
    sum += (*a-*b)*(*a-*b);
    ++a;
    ++b;
  }
  return sqrt(sum);
}

float *ComputeNumericDescriptor(int modelIndex,Dictionary models,
				Dictionary *fonts,int numberOfFonts,int numberOfWords,
				DiffDescriptor dd,
				float *sd,float *avg)
{
  float *d;
  int i,j;
  float temp;
  float sumxx[MAX_WORDS];
  float sdev[MAX_WORDS];
  float sumsdev,sumscore;

  if ((d = (float *)calloc(numberOfWords,sizeof(float))) == NULL)
    DoError("ComputeNumericDescriptor: cannot allocate space.\n",NULL);
  for (j=0;j<numberOfWords;++j)
    sumxx[j] = 0;
  /* Accumulate the sum and sum of squares of the scores over all fonts */
  for (i=0;i<numberOfFonts;++i)
    for (j=0;j<numberOfWords;++j) {
      temp = DiffPair(*(models->outlines+modelIndex),*(fonts[i]->outlines+j),dd);
      if (temp<BIG_NUM) {
	d[j] += temp;
	sumxx[j] += temp*temp;
      }
    }

  if (numberOfFonts>1) {
    float sum,minsdev,maxsdev;
    for (j=0;j<numberOfWords;++j)
      sdev[j] = sqrt((numberOfFonts*sumxx[j]-d[j]*d[j])/numberOfFonts/(numberOfFonts-1));
    for (j=0,sumsdev=0,sumscore=0;j<numberOfWords;++j) {
      sumsdev += sdev[j];
      sumscore += d[j];
    }
    *sd = sumsdev/numberOfWords;
    *avg = sumscore/numberOfWords;
  }

  for (j=0;j<numberOfWords;++j)
    d[j] /= numberOfFonts;

  return d;
}

typedef struct {
  float score;
  int x;
  int y;
} *CompareTuple,CompareTupleBody;

int TupleLessThan(CompareTuple *x,CompareTuple *y)
{
  if ((*x)->score == (*y)->score)
    return 0;
  else if ((*x)->score < (*y)->score)
    return -1;
  else
    return 1;
}

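`TupleLessThan` is used as a `qsort` comparator over an array of tuple pointers. The sketch below shows the same ordering with the comparator signature that standard C requires, `int (*)(const void *, const void *)`. The names `Tuple`, `tuple_ptr_cmp` and `sort_tuples` are hypothetical and stand in for the listing's types.

```c
#include <stdlib.h>

typedef struct {
    float score;
    int x;
    int y;
} Tuple;

/* qsort hands the comparator pointers to array elements; the elements
 * here are themselves pointers to Tuple, hence the double indirection. */
static int tuple_ptr_cmp(const void *pa, const void *pb)
{
    const Tuple *a = *(const Tuple *const *)pa;
    const Tuple *b = *(const Tuple *const *)pb;

    if (a->score < b->score)
        return -1;
    if (a->score > b->score)
        return 1;
    return 0;
}

/* Sort an array of n tuple pointers by ascending score. */
void sort_tuples(Tuple **ptrs, size_t n)
{
    qsort(ptrs, n, sizeof *ptrs, tuple_ptr_cmp);
}
```

Sorting pointers rather than the structures themselves keeps the underlying tuple storage in place, which matches how the listing fills `tuples` and sorts `scores`.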

void DoDescriptors(Dictionary models,char *modelName,char **wordNames,
                   int numberOfFonts,Dictionary *fonts,char **fontNames,
                   int numberOfWords,DiffDescriptor dd)
{
  float *descriptors[MAX_WORDS];
  int classes[MAX_WORDS][MAX_WORDS];
  float sdev[MAX_WORDS],avg[MAX_WORDS];
  CompareTupleBody tuples[MAX_WORDS*MAX_WORDS];
  CompareTuple scores[MAX_WORDS*MAX_WORDS];
  int i,x,y,j;
  int count;
  /* float threshold = 0.22; */
  float threshold = 0.42;

  for (i=0;i<numberOfWords;++i) {
    descriptors[i] =
      ComputeNumericDescriptor(i,models,fonts,numberOfFonts,numberOfWords,dd,
                               sdev+i,avg+i);
    fprintf(stdout,"%s: %6.4f %6.4f\n",wordNames[i],avg[i],sdev[i]);
  }
  fprintf(stdout,"\n\n");

  for (y=0;y<numberOfWords;++y)
    for (x=0;x<numberOfWords;++x)
      classes[y][x] =
        (CompareNumericDescriptors(descriptors[y],descriptors[x],numberOfWords)
         < threshold);


#ifdef foo
  for (y=0,i=0;y<numberOfWords;++y)
    for (x=0;x<y;++x) {
      CompareTuple temp;
      /*
      temp = (CompareTuple)calloc(1,sizeof(CompareTupleBody));
      if (temp == NULL)
        DoError(": cannot allocate space.\n",NULL);
      */
      temp = tuples+i;
      temp->score =
        CompareNumericDescriptors(descriptors[y],descriptors[x],numberOfWords);
      temp->x = x;
      temp->y = y;
      scores[i] = temp;

    }
  qsort(scores,i,sizeof(CompareTuple),TupleLessThan);

  for (j=0;j<i;++j)
    fprintf(stdout,"(%s,%s): %f\n",wordNames[scores[j]->y],wordNames[scores[j]->x],scores[j]->score);
#endif

  fprintf(stdout,"\n\n");
  for (i=0;i<numberOfWords;++i) {
    CompareTuple temp;
    float *thisDesc;
    float junk;
    thisDesc =
      ComputeNumericDescriptor(i,models,&models,1,numberOfWords,dd,&junk,&junk);
    for (j=0;j<numberOfWords;++j) {
      temp = tuples+j;
      temp->score = CompareNumericDescriptors(thisDesc,descriptors[j],numberOfWords);
      temp->y = i;
      temp->x = j;
      scores[j] = temp;
    }
    qsort(scores,numberOfWords,sizeof(CompareTuple),TupleLessThan);

    fprintf(stdout,"%s: ",wordNames[i]);
    for (j=0;j<5&&j<numberOfWords;++j) {
      fprintf(stdout,"%s ",wordNames[scores[j]->x]);
      if (scores[j]->x == i)
        break;
    }
    if (scores[j]->x==i)
      fprintf(stdout,"\n");
    else {
      for (;j<numberOfWords;++j)
        if (scores[j]->x==i)
          break;
      fprintf(stdout," (%d more)\n",j-5);
    }

    fprintf(stdout,"  ");
    count = 0;
    for (j=0;j<numberOfWords;++j)
      if (classes[scores[0]->x][j]) {
        fprintf(stdout,"%s ",wordNames[j]);
        ++count;
        if (count > 5)
          break;
      }
    if (j < numberOfWords) {
      for (count=0;j<numberOfWords;++j)
        if (classes[scores[0]->x][j])
          ++count;
      fprintf(stdout," (%d more)\n",count);
    }
    else
      fprintf(stdout,"\n");

    free(thisDesc);
  }
}

void main(int argc,char **argv)
{
  char *listFile;
  Dictionary models;
  char *modelName;
  int numberOfFonts;
  Dictionary fonts[MAX_FONTS];
  char *fontNames[MAX_FONTS];
  char *wordNames[MAX_WORDS];
  int numberOfWords;
  float centerWeight;
  int normalBandWidth;
  BOOLEAN lengthNormalize,useL2,slopeConstrain,warp,topToBottomOption,hillToValleyOption;
  BOOLEAN separate;
  float topToBottom,hillToValleyLocal;
  FILE *listfp;
  int i,x,y;
  DiffDescriptorBody dd;

  centerWeight = 1.0;
  normalBandWidth = 20;
  topToBottom = 1.0;
  hillToValleyLocal = 1.0;
  DefArg("%s","listFile",&listFile);
  DefOption("-L2","-L2",&useL2);
  DefOption("-slopeConstrain %f","-slopeConstrain <center weight>",
            &slopeConstrain,&centerWeight);
  DefOption("-warp %f %d","-warp <center weight> <band width>",
            &warp,&centerWeight,&normalBandWidth);
  DefOption("-separate","-separate",&separate);
  DefOption("-normalize","-normalize",&lengthNormalize);
  DefOption("-topToBottom %f","-topToBottom <ratio>",&topToBottomOption,&topToBottom);
  DefOption("-hillToValley%f","-hillToValley <ratio>",&hillToValleyOption,&hillToValleyLocal);
  ScanArgs(argc,argv);

  if ((listfp = fopen(listFile,"r")) == NULL)
    DoError("Error opening file %s.\n",listFile);

  /* Read in the number of words in each dictionary */
  numberOfWords = ReadInt(listfp);
  if (numberOfWords > MAX_WORDS)
    DoError("%s: too many words.\n",argv[0]);

  /* Read in the words */
  for (i=0;i<numberOfWords;++i) {
    wordNames[i] = ReadString(listfp);
  }

  /* Read in the model dictionary */
  modelName = ReadString(listfp);
  models = ReadDictionary(modelName);

  /* Read in the number of dictionaries */
  numberOfFonts = ReadInt(listfp);
  if (numberOfFonts > MAX_FONTS)
    DoError("%s: too many dictionaries.\n",argv[0]);

  /* Read in the dictionaries and their names */
  for (i=0;i<numberOfFonts;++i) {
    fontNames[i] = ReadString(listfp);
    fonts[i] = ReadDictionary(fontNames[i]);
  }

  /* Check to see that all dictionaries have the same number of shapes as the specified number of words. */
  for (i=1;i<numberOfFonts;++i)
    if (fonts[i]->numberOfEntries < numberOfWords)
      DoError("Dictionary %s has too few entries.\n",fontNames[i]);
  if (models->numberOfEntries < numberOfWords)
    DoError("Model dictionary has too few entries.\n",NULL);


  if (useL2) {
    fprintf(stdout,"Using L2 on length normalized shapes.\n");
    dd.diffType = L2;
  }
  else if (slopeConstrain) {
    fprintf(stdout,"Using dynamic time warping with slope constrained to [0.5,2].\n");
    dd.diffType = CONSTRAINED;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  else {
    fprintf(stdout,"Using dynamic time warping with bandwidth %d.\n",normalBandWidth);
    dd.diffType = WARP;
    dd.bandWidth = normalBandWidth;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  if (!useL2) {
    fprintf(stdout,"Center weight = %f.\n",centerWeight);
    dd.centerWeight = centerWeight;
    if (lengthNormalize) {
      dd.lengthNormalize = TRUE;
      fprintf(stdout,"Scores normalized by signal length.\n");
    }
    else
      dd.lengthNormalize = FALSE;
  }
  dd.hillToValley = hillToValleyLocal;
  dd.topToBottom = topToBottom;
  dd.pathFP = NULL;

  fprintf(stdout,"Words:\n");
  for (i=0;i<numberOfWords;++i)
    fprintf(stdout,"%d: %s\n",i,wordNames[i]);
  fprintf(stdout,"\n");

  fprintf(stdout,"Model font is %s.\n",modelName);
  fprintf(stdout,"Fonts:\n");
  for (i=0;i<numberOfFonts;++i)
    fprintf(stdout,"%d: %s\n",i,fontNames[i]);
  fprintf(stdout,"\n");

  DoDescriptors(models,modelName,wordNames,numberOfFonts,fonts,fontNames,numberOfWords,&dd);
}
Jun 18 16:20 1991 resample.c

#include <stdio.h>
#include <stdlib.h>
#include <values.h>
#include <string.h>
#include <floatingpoint.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "dict.h"

void Resample(OutlinePair signal,float factor)
{
  int i,count;
  float pivot;
  float delFactor;
  float *oldTop,*newTop;
  float *oldBottom,*newBottom;
  float *topSPtr,*topDPtr;
  float *bottomSPtr,*bottomDPtr;

  /* First pass: count how many legs survive the compression. */
  delFactor = 1.0 - factor;
  for (i=0,count=0,pivot=0.0;i<signal->numberOfLegs;++i) {
    if (pivot >= 1.0) {
      pivot -= 1.0;
      pivot += delFactor;
    }
    else {
      pivot += delFactor;
      ++count;
    }
  }

  newTop = (float *)calloc(count,sizeof(float));
  newBottom = (float *)calloc(count,sizeof(float));
  if ((newTop==NULL)||(newBottom==NULL))
    DoError("Resample: cannot allocate space.\n",NULL);

  oldTop = signal->top;
  oldBottom = signal->bottom;

  /* Second pass: copy the surviving legs, skipping one whenever pivot >= 1. */
  topSPtr = signal->top;
  bottomSPtr = signal->bottom;
  topDPtr = newTop;
  bottomDPtr = newBottom;
  for (i=0,pivot=0.0;i<signal->numberOfLegs;++i) {
    if (pivot >= 1.0) {
      pivot -= 1.0;
      pivot += delFactor;
      ++topSPtr;
      ++bottomSPtr;
    }
    else {
      pivot += delFactor;
      *topDPtr++ = *topSPtr++;
      *bottomDPtr++ = *bottomSPtr++;
    }
  }

  signal->top = newTop;
  signal->bottom = newBottom;
  signal->numberOfLegs = count;

  free(oldTop);
  free(oldBottom);
}
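
The skip rule used by Resample can be seen more easily on a plain array. The sketch below applies the same pivot accumulator to a single float array; the name `decimate` and its caller-supplied output buffer are assumptions made for illustration and do not appear in the listing.

```c
/* Hypothetical illustration of the pivot accumulator used by Resample.
 * Copies in[0..n-1] to out, dropping one sample each time the
 * accumulator reaches 1. Roughly factor*n samples are kept over a long
 * signal. out must have room for n floats. Returns the number kept. */
int decimate(const float *in, int n, float factor, float *out)
{
  float delFactor = 1.0f - factor;
  float pivot = 0.0f;
  int i, count = 0;

  for (i = 0; i < n; ++i) {
    if (pivot >= 1.0f) {
      /* Accumulator overflowed: skip this sample. */
      pivot -= 1.0f;
      pivot += delFactor;
    }
    else {
      pivot += delFactor;
      out[count++] = in[i];
    }
  }
  return count;
}
```

With a factor of 0.5 and ten input samples, this rule keeps the samples at indices 0, 1, 3, 5, 7 and 9: the first two are kept before the accumulator first reaches 1, after which every other sample is dropped.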

void main(int argc,char **argv)
{
  char *inFile,*outFile;
  float factor;
  int i;
  Dictionary dict;

  if (argc != 4) {
    fprintf(stderr,"Usage:\n");
    fprintf(stderr,"  %s <inputfile> <outputfile> <compression factor>\n",argv[0]);
    fprintf(stderr,"Compresses shapes horizontally.\n");
    exit(-1);
  }

  inFile = argv[1];
  outFile = argv[2];
  factor = atof(argv[3]);

  /* The range check must follow the parse of the factor. */
  if ((factor >= 1.0)||(factor < 0.0))
    DoError("%s: factor must be between 0 and 1.\n",argv[0]);

  dict = ReadDictionary(inFile);

  for (i=0;i<dict->numberOfEntries;++i)
    Resample(*(dict->outlines+i),factor);

  WriteDictionary(dict,outFile);
}
Jul 31 16:48 1991 sepMatch.c

#include <stdio.h>
#include "mylib.h"
#include "misc.h"

#define MAX_SIGNAL_LENGTH (800)
#define MAX_SLOPE (2.0)
#define BIG_NUM (10e20)

typedef enum {NONE,LEFT,DOWN,DOWNLEFT,D1L1,D2L1,D1L2} Direction;

extern double sqrt(double);
extern double cos(double);
extern double atan(double);
extern int irint(double);

/* Assumes that a represents the model and b represents the unknown.
 * Weights places where the model is lower than the unknown more than
 * cases where the model is higher than the unknown. The idea here is
 * that valleys can be filled in by bleeding together, but that noise
 * can rarely make a contour be too tall for extended periods.
 */
float hillToValley = 1.0;
inline float SquareDifference(float a,float b)
{
  float temp = a-b;
  if (temp < 0)
    return temp*temp;
  else
    return temp*temp*hillToValley*hillToValley;
  /* return (a-b)*(a-b); */
}
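
To make the asymmetry concrete, the sketch below restates the same cost with the weight passed as an argument instead of read from the global. The name `asym_square_diff` is an illustrative assumption. As written in the listing, the scaled branch is the one where the first argument is greater than or equal to the second.

```c
/* Hypothetical restatement of SquareDifference with the weight passed
 * explicitly. The squared difference is scaled by hillToValley squared
 * only when model >= unknown; otherwise it is left unscaled. */
float asym_square_diff(float model, float unknown, float hillToValley)
{
  float d = model - unknown;

  if (d < 0)
    return d*d;
  return d*d*hillToValley*hillToValley;
}
```

With a weight of 2, a difference of 2 costs 4 when the model is below the unknown and 16 when it is above; with a weight of 1 the measure reduces to the ordinary squared difference.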

inline float FMax(float a,float b)
{
  if (a > b)
    return a;
  else
    return b;
}

inline float FMin(float a,float b)
{
  if (a < b)
    return a;
  else
    return b;
}

inline int IMax(int a,int b)
{
  if (a > b)
    return a;
  else
    return b;
}

inline int IMin(int a,int b)
{
  if (a < b)
    return a;
  else
    return b;
}

float NewMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
               float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
               float topToBottom)
{
  float costs0[MAX_SIGNAL_LENGTH+1];
  float costs1[MAX_SIGNAL_LENGTH+1];
  int i,j,start,end,bandWidth,shift;
  int realStart,realEnd,center,oldEnd;
  float slope,angle;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float oldCost,b1v,b2v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("NewMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;

  if ((slope > MAX_SLOPE)||(1/slope > MAX_SLOPE)) {
    return BIG_NUM;
  }

  angle = atan(slope);

  bandWidth = irint(normalBandWidth/cos(angle));
  center = 0;

  realStart = center-bandWidth/2;
  realEnd = realStart+bandWidth;
  end = FMin(realEnd,aLength);

  a1c = a1; /* a1 cursor */
  a2c = a2; /* a2 cursor */
  b1v = *b1; /* b1 value */
  b2v = *b2; /* b2 value */
  dc = costs0;
  *dc++ = BIG_NUM;
  oldCost = *dc++ =
    SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);

  for (j=1;j<end;++j)
    oldCost = *dc++ =
      oldCost+SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);

  for (i=1;i<bLength;++i) {
    /* Compute new center of band */
    center = irint(slope*i);
    realStart = center-bandWidth/2;
    realEnd = realStart+bandWidth;
    start = FMax(realStart,0);
    oldEnd = end;
    end = FMin(realEnd,aLength);
    shift = end-oldEnd;

    /* put large numbers where bands don't overlap */
    for (j=0;j<shift;++j)
      *dc++ = BIG_NUM;

    a1c = a1+start; /* a1 cursor */
    a2c = a2+start; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */
    if (i&1) {
      cd = costs1+start-1+1; /* cursor down??? What about -1??? */
      cdl = costs0+start-1+1; /* cursor down left */
      cl = costs0+start+1; /* cursor left */
      dc = costs1+start+1; /* destination cursor */
    }
    else {
      cd = costs0+start-1+1; /* cursor down */
      cdl = costs1+start-1+1; /* cursor down left */
      cl = costs1+start+1; /* cursor left */
      dc = costs0+start+1; /* destination cursor */
    }
    *cd = BIG_NUM;
    for (j=start;j<end;++j) {
      float down,left,downLeft,rest;
      /* The local cost is computed first, as in NewMatchAndPath. */
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
        SquareDifference(*a2c++,b2v);
      down = *cd++ + rest;
      left = *cl++ + rest;
      downLeft = *cdl++ + rest*centerWeight;
      *dc++ = FMin(FMin(down,left),downLeft);
    }
  }

  i--;
  if (i&1)
    dc = costs1+aLength-1+1;
  else
    dc = costs0+aLength-1+1;
  returnVal = *dc;

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}
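
The general technique behind NewMatch, dynamic time warping restricted to a band around the diagonal and evaluated with two rolling cost rows, can be sketched on single-contour signals as follows. This is a simplified illustration under stated assumptions, not the listing's implementation: `banded_dtw`, `sketch_min3`, `SKETCH_BIG`, the fixed half-width band and the heap-allocated rows are all hypothetical, and the top/bottom contour pair, the `topToBottom` weight and the length normalization are left out.

```c
#include <stdlib.h>

#define SKETCH_BIG 1e20f  /* plays the role of BIG_NUM; illustrative name */

static float sketch_min3(float a, float b, float c)
{
  float m = (a < b) ? a : b;
  return (m < c) ? m : c;
}

/* Hypothetical banded DTW between a[0..n-1] and b[0..m-1].
 * Row i covers only the cells within halfBand of the diagonal position
 * (i*n)/m; cells outside the band hold SKETCH_BIG. Allowed moves are
 * down (same row, previous column), left (previous row, same column)
 * and down-left (previous row, previous column, local cost scaled by
 * centerWeight). Returns the cost of the best path ending at the last
 * cell, or SKETCH_BIG if allocation fails or the cell is unreachable. */
float banded_dtw(const float *a, int n, const float *b, int m,
                 int halfBand, float centerWeight)
{
  float *prev = malloc((size_t)n * sizeof *prev);
  float *cur = malloc((size_t)n * sizeof *cur);
  float *tmp, result;
  int i, j;

  if (prev == NULL || cur == NULL) {
    free(prev);
    free(cur);
    return SKETCH_BIG;
  }

  for (i = 0; i < m; ++i) {
    int center = (i * n) / m;
    int lo = center - halfBand;
    int hi = center + halfBand;
    if (lo < 0) lo = 0;
    if (hi > n - 1) hi = n - 1;

    for (j = 0; j < n; ++j)
      cur[j] = SKETCH_BIG;

    for (j = lo; j <= hi; ++j) {
      float d = a[j] - b[i];
      float rest = d * d;
      if (i == 0 && j == 0)
        cur[j] = rest;
      else {
        float down = (j > 0) ? cur[j-1] + rest : SKETCH_BIG;
        float left = (i > 0) ? prev[j] + rest : SKETCH_BIG;
        float diag = (i > 0 && j > 0) ? prev[j-1] + rest*centerWeight
                                      : SKETCH_BIG;
        cur[j] = sketch_min3(down, left, diag);
      }
    }
    tmp = prev; prev = cur; cur = tmp;  /* roll the two rows */
  }

  result = prev[n-1];
  free(prev);
  free(cur);
  return result;
}
```

Keeping only two rows reduces storage from the full cost matrix to two vectors, which is sufficient when only the final score is needed; recovering the warping path requires recording a direction per cell, as NewMatchAndPath does.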


void PrintPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
               Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],int i,
               float returnVal,
               FILE *pathFP)
{
  int x,y,j;
  int length = 0;
  int index = 0;
  float newTop1[MAX_SIGNAL_LENGTH],newBottom1[MAX_SIGNAL_LENGTH];
  float newTop2[MAX_SIGNAL_LENGTH],newBottom2[MAX_SIGNAL_LENGTH];

  /* First pass: walk the path backwards to find its length. */
  y = i;
  x = aLength-1;
  while (path[y][x] != NONE) {
    switch (path[y][x]) {
    case DOWN:
      x--;
      break;
    case LEFT:
      y--;
      break;
    case D1L1:
    case DOWNLEFT:
      x--;
      y--;
      break;
    case D2L1:
      x -= 2;
      y--;
      break;
    case D1L2:
      x--;
      y -= 2;
      break;
    default:
      DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
    }
    ++length;
  }

  /* Second pass: fill the warped signals from the end of the path. */
  y = i;
  x = aLength-1;
  while (path[y][x] != NONE) {
    if (index >= MAX_SIGNAL_LENGTH)
      DoError("NewMatchAddPath: warped signal is too long.\n",NULL);
    newTop1[length-index] = a1[x];
    newBottom1[length-index] = a2[x];
    newTop2[length-index] = b1[y];
    newBottom2[length-index] = b2[y];
    switch (path[y][x]) {
    case DOWN:
      x--;
      break;
    case LEFT:
      y--;
      break;
    case D1L1:
    case DOWNLEFT:
      x--;
      y--;
      break;
    case D2L1:
      x -= 2;
      y--;
      break;
    case D1L2:
      x--;
      y -= 2;
      break;
    default:
      DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
    }
    ++index;
  }
  if (index >= MAX_SIGNAL_LENGTH)
    DoError("NewMatchAddPath: warped signal is too long.\n",NULL);
  newTop1[length-index] = a1[x];
  newBottom1[length-index] = a2[x];
  newTop2[length-index] = b1[y];
  newBottom2[length-index] = b2[y];
  ++index;

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,newTop1[j]);
  fprintf(pathFP,"\"top1\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,newTop2[j]);
  fprintf(pathFP,"\"top2\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,-newBottom1[j]);
  fprintf(pathFP,"\"bottom1\n\n");

  for (j=0;j<index;++j)
    fprintf(pathFP,"%d %f\n",j,-newBottom2[j]);
  fprintf(pathFP,"\"bottom2\n\n");

  {
    float checksum = 0;
    fprintf(pathFP,"%d %f\n",0,checksum);
    for (j=0,checksum=0;j<index;++j) {
      checksum += SquareDifference(newTop1[j],newTop2[j]) +
        SquareDifference(newBottom1[j],newBottom2[j]);
      fprintf(pathFP,"%d %f\n",j,checksum);
    }
    printf("checksum, score = %6.2f, %6.2f\n",checksum,returnVal);
  }
}



float NewMatchAndPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
                      float centerWeight,BOOLEAN lengthNormalize,int normalBandWidth,
                      float topToBottom,FILE *pathFP)
{
  Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],*pc;
  int x,y;
  float costs0[MAX_SIGNAL_LENGTH+1];
  float costs1[MAX_SIGNAL_LENGTH+1];
  int i,j,start,end,bandWidth,shift;
  int realStart,realEnd,center,oldEnd;
  float slope,angle;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float oldCost,b1v,b2v,returnVal;

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("NewMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;

  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }
  angle = atan(slope);
  bandWidth = irint(normalBandWidth/cos(angle));
  center = 0;
  realStart = center-bandWidth/2;
  realEnd = realStart+bandWidth;
  end = FMin(realEnd,aLength);

  a1c = a1; /* a1 cursor */
  a2c = a2; /* a2 cursor */
  b1v = *b1; /* b1 value */
  b2v = *b2; /* b2 value */
  dc = costs0;
  pc = &(path[0][0]);
  *dc++ = BIG_NUM;
  oldCost = *dc++ =
    SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);
  *pc++ = NONE;

  for (j=1;j<end;++j) {
    oldCost = *dc++ =
      oldCost+SquareDifference(*a1c++,b1v)*topToBottom+SquareDifference(*a2c++,b2v);
    *pc++ = DOWN;
  }

#ifdef foo
  printf("%6d ",0);
  for (j=0;j<end;++j)
    printf("%6.2f ",costs0[j+1]);
#endif

  for (i=1;i<bLength;++i) {
    /* Compute new center of band */
    center = irint(slope*i);
    realStart = center-bandWidth/2;
    realEnd = realStart+bandWidth;
    start = FMax(realStart,0);
    oldEnd = end;
    end = FMin(realEnd,aLength);
    shift = end-oldEnd;

    /* put large numbers where bands don't overlap */
    for (j=0;j<shift;++j) {
      /* printf("%6.2f ",BIG_NUM); */
      *dc++ = BIG_NUM;
    }
    /* printf("\n%6d ",i); */

    a1c = a1+start; /* a1 cursor */
    a2c = a2+start; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */
    pc = &(path[i][start]);
    if (i&1) {
      cd = costs1+start-1+1; /* cursor down??? What about -1??? */
      cdl = costs0+start-1+1; /* cursor down left */
      cl = costs0+start+1; /* cursor left */
      dc = costs1+start+1; /* destination cursor */
    }
    else {
      cd = costs0+start-1+1; /* cursor down */
      cdl = costs1+start-1+1; /* cursor down left */
      cl = costs1+start+1; /* cursor left */
      dc = costs0+start+1; /* destination cursor */
    }
    *cd = BIG_NUM;
    for (j=start;j<end;++j) {
      float down,left,downLeft,rest;
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
        SquareDifference(*a2c++,b2v);
      down = *cd++ + rest;
      left = *cl++ + rest;
      downLeft = *cdl++ + rest*centerWeight;

      if (down < left)
        if (down < downLeft) {
          /* printf("%6.2f ",down); */
          *dc++ = down;
          *pc++ = DOWN;
        }
        else {
          /* printf("%6.2f ",downLeft); */
          *dc++ = downLeft;
          *pc++ = DOWNLEFT;
        }
      else
        if (downLeft < left) {
          /* printf("%6.2f ",downLeft); */
          *dc++ = downLeft;
          *pc++ = DOWNLEFT;
        }
        else {
          /* printf("%6.2f ",left); */
          *dc++ = left;
          *pc++ = LEFT;
        }
    }
  }

  i--;
  if (i&1)
    dc = costs1+aLength-1+1;
  else
    dc = costs0+aLength-1+1;
  returnVal = *dc;

#ifdef foo
  if (!doPath) {
    y = i;
    x = aLength-1;
    while (path[y][x] != NONE) {
      switch (path[y][x]) {
      case DOWN:
        x--;
        break;
      case LEFT:
        y--;
        break;
      case DOWNLEFT:
        x--;
        y--;
        break;
      default:
        DoError("NewMatchAndPath: Internal error - bad case.\n",NULL);
      }
      fprintf(pathFP,"%d %d\n",x,y);
    }
    fprintf(pathFP,"%d %d\n",x,y);
  }
  else
#endif
  PrintPath(a1,a2,aLength,b1,b2,bLength,path,i,returnVal,pathFP);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}


float SlopeCMatch(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
                  float centerWeight,BOOLEAN lengthNormalize,float topToBottom)
{
  float costs0[MAX_SIGNAL_LENGTH+2];
  float costs1[MAX_SIGNAL_LENGTH+2];
  float costs2[MAX_SIGNAL_LENGTH+2];
  float slope,minVal;
  int i,j;
  int bottom,top;
  float *cd1l1,*cd2l1,*cd1l2;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float b1v,b2v,returnVal;

  /* printf("sc:\n"); */

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("SlopeCMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }

  for (i=0;i<aLength+2;++i) {
    costs2[i] = BIG_NUM;
    costs1[i] = BIG_NUM;
    costs0[i] = BIG_NUM;
  }

  costs0[2] = SquareDifference(*a1,*b1)*topToBottom+SquareDifference(*a2,*b2);

  for (i=1;i<bLength;++i) {
    bottom = IMax(i/2,2*i+aLength-2*bLength);
    top = IMin(2*i,i/2+aLength-bLength/2)+1;

    a1c = a1+bottom; /* a1 cursor */
    a2c = a2+bottom; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */

    switch (i%3) {
    case 0:
      dc = costs0+bottom-2+2;
      cd2l1 = costs2+bottom-2+2;
      cd1l2 = costs1+bottom-1+2;
      cd1l1 = costs2+bottom-1+2;
      break;
    case 1:
      dc = costs1+bottom-2+2;
      cd2l1 = costs0+bottom-2+2;
      cd1l2 = costs2+bottom-1+2;
      cd1l1 = costs0+bottom-1+2;
      break;
    case 2:
      dc = costs2+bottom-2+2;
      cd2l1 = costs1+bottom-2+2;
      cd1l2 = costs0+bottom-1+2;
      cd1l1 = costs1+bottom-1+2;
      break;
    }
    *dc++ = BIG_NUM;
    *dc++ = BIG_NUM;
    for (j=bottom;j<top;++j) {
      float d2l1,d1l2,d1l1,rest;
      rest = SquareDifference(*a1c++,b1v)*topToBottom+
        SquareDifference(*a2c++,b2v);
      d1l1 = *cd1l1++ + rest*centerWeight;
      d1l2 = *cd1l2++ + rest;
      d2l1 = *cd2l1++ + rest;

      *dc++ = FMin(FMin(d1l1,d2l1),d1l2);
    }

    switch (i%3) {
    case 0:
      dc = costs0;
      break;
    case 1:
      dc = costs1;
      break;
    case 2:
      dc = costs2;
      break;
    }

#ifdef foo
    minVal = BIG_NUM;
    printf("%6d: ",i);
    for (j=0;j<aLength+2;++j) {
      if (*dc <= minVal)
        minVal = *dc;
      if (*dc++ >= BIG_NUM)
        printf(" ");
      else
        printf("*");
    }
    printf(" %6.2f\n",minVal);
#endif
  }

  --i;
  switch (i%3) {
  case 0:
    dc = costs0;
    break;
  case 1:
    dc = costs1;
    break;
  case 2:
    dc = costs2;
    break;
  }
  returnVal = *(dc+aLength-1+2);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}
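
SlopeCMatch limits the local slope of the warping path by allowing only three moves into a cell: one step in each signal (D1L1), two steps in a and one in b (D2L1), or one step in a and two in b (D1L2). The sketch below shows that recurrence on single-contour signals with a full cost matrix, which is easier to follow than the three rotating rows of the listing. The names `slope_dtw` and `SLOPE_BIG`, and the use of a heap-allocated matrix, are assumptions made for illustration only.

```c
#include <stdlib.h>

#define SLOPE_BIG 1e20f  /* plays the role of BIG_NUM; illustrative name */

/* Hypothetical slope-constrained DTW between a[0..n-1] and b[0..m-1].
 * cost[i*n + j] is the best cost of a path from cell (0,0) to the cell
 * pairing b[i] with a[j], using only the moves D1L1, D2L1 and D1L2.
 * Every move advances both signals, so the path slope stays within
 * [0.5, 2]. Returns SLOPE_BIG if the last cell is unreachable or the
 * allocation fails. */
float slope_dtw(const float *a, int n, const float *b, int m,
                float centerWeight)
{
  float *cost = malloc((size_t)n * (size_t)m * sizeof *cost);
  float result, d;
  int i, j;

  if (cost == NULL)
    return SLOPE_BIG;
  for (i = 0; i < n*m; ++i)
    cost[i] = SLOPE_BIG;

  d = a[0] - b[0];
  cost[0] = d*d;

  for (i = 1; i < m; ++i)
    for (j = 1; j < n; ++j) {
      float rest, best, c;
      d = a[j] - b[i];
      rest = d*d;
      best = cost[(i-1)*n + (j-1)] + rest*centerWeight;  /* D1L1 */
      if (j >= 2) {
        c = cost[(i-1)*n + (j-2)] + rest;                /* D2L1 */
        if (c < best) best = c;
      }
      if (i >= 2) {
        c = cost[(i-2)*n + (j-1)] + rest;                /* D1L2 */
        if (c < best) best = c;
      }
      cost[i*n + j] = (best < SLOPE_BIG) ? best : SLOPE_BIG;
    }

  result = cost[(m-1)*n + (n-1)];
  free(cost);
  return result;
}
```

Because each move consumes at most two samples of one signal per sample of the other, a signal more than about twice as long as its partner cannot be aligned at all, which is why the listing returns BIG_NUM up front when the length ratio exceeds MAX_SLOPE.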


float SlopeCMatchAndPath(float *a1,float *a2,int aLength,float *b1,float *b2,int bLength,
                         float centerWeight,BOOLEAN lengthNormalize,float topToBottom,
                         FILE *pathFP)
{
  Direction path[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH],*pc;
  float costs0[MAX_SIGNAL_LENGTH+2];
  float costs1[MAX_SIGNAL_LENGTH+2];
  float costs2[MAX_SIGNAL_LENGTH+2];
  float slope,minVal;
  int i,j;
  int bottom,top;
  float *cd1l1,*cd2l1,*cd1l2;
  float *a1c,*a2c,*cd,*cl,*cdl,*dc;
  float b1v,b2v,returnVal;

  /* printf("sc:\n"); */

  if (aLength>MAX_SIGNAL_LENGTH||bLength>MAX_SIGNAL_LENGTH)
    DoError("SlopeCMatch: maximum signal length exceeded.\n",NULL);

  slope = (float)aLength/(float)bLength;
  if ((slope>MAX_SLOPE)||(1/slope>MAX_SLOPE)) {
    return BIG_NUM;
  }

  for (i=0;i<aLength+2;++i) {
    costs2[i] = BIG_NUM;
    costs1[i] = BIG_NUM;
    costs0[i] = BIG_NUM;
  }

  pc = &(path[0][0]);
  *pc++ = NONE;
  costs0[2] = SquareDifference(*a1,*b1)*topToBottom+SquareDifference(*a2,*b2);

  for (i=1;i<bLength;++i) {
    bottom = IMax(i/2,2*i+aLength-2*bLength);
    top = IMin(2*i,i/2+aLength-bLength/2)+1;

    a1c = a1+bottom; /* a1 cursor */
    a2c = a2+bottom; /* a2 cursor */
    b1v = *(b1+i); /* b1 value */
    b2v = *(b2+i); /* b2 value */

    switch (i%3) {
    case 0:
      dc = costs0+bottom-2+2;
      cd2l1 = costs2+bottom-2+2;
      cd1l2 = costs1+bottom-1+2;
      cd1l1 = costs2+bottom-1+2;
      break;
    case 1:
      dc = costs1+bottom-2+2;
      cd2l1 = costs0+bottom-2+2;
      cd1l2 = costs2+bottom-1+2;
      cd1l1 = costs0+bottom-1+2;
      break;
    case 2:
      dc = costs2+bottom-2+2;
      cd2l1 = costs1+bottom-2+2;
      cd1l2 = costs0+bottom-1+2;
      cd1l1 = costs1+bottom-1+2;
      break;
    }
    *dc++ = BIG_NUM;
    *dc++ = BIG_NUM;
    pc = &(path[i][bottom]);
    for (j=bottom;j<top;++j) {
      float d2l1,d1l2,d1l1,rest;

      rest = SquareDifference(*a1c++,b1v)*topToBottom+
        SquareDifference(*a2c++,b2v);
      d1l1 = *cd1l1++ + rest*centerWeight;
      d1l2 = *cd1l2++ + rest;
      d2l1 = *cd2l1++ + rest;

      if (d1l1 < d1l2)
        if (d1l1 < d2l1) {
          *dc++ = d1l1;
          *pc++ = D1L1;
        }
        else {
          *dc++ = d2l1;
          *pc++ = D2L1;
        }
      else
        if (d1l2 < d2l1) {
          *dc++ = d1l2;
          *pc++ = D1L2;
        }
        else {
          *dc++ = d2l1;
          *pc++ = D2L1;
        }
    }

    switch (i%3) {
    case 0:
      dc = costs0;
      break;
    case 1:
      dc = costs1;
      break;
    case 2:
      dc = costs2;
      break;
    }
    minVal = BIG_NUM;
    printf("%6d: ",i);
    for (j=0;j<aLength+2;++j) {
      if (*dc <= minVal)
        minVal = *dc;
      if (*dc++ >= BIG_NUM)
        printf(" ");
      else
        printf("*");
    }
    printf("%6.2f\n",minVal);
  }

  --i;
  switch (i%3) {
  case 0:
    dc = costs0;
    break;
  case 1:
    dc = costs1;
    break;
  case 2:
    dc = costs2;
    break;
  }
  returnVal = *(dc+aLength-1+2);

  PrintPath(a1,a2,aLength,b1,b2,bLength,path,i,returnVal,pathFP);

  if (lengthNormalize)
    return returnVal/sqrt(aLength*aLength+bLength*bLength);
  else
    return returnVal;
}

Jul 31 17:14 1991 single.c

#include <stdio.h>
#include "mylib.h"
#include "types.h"
#include "dict.h"
#include "diff.h"
#include "match.h"
#include "matchparallel.h"

main(argc,argv)
int argc;
char *argv[];
{
  char *dictFile1,*dictFile2,*outFile;
  int shape1,shape2;
  Dictionary dict1,dict2;
  float score;
  char *matchType;
  float centerWeight,normalBandWidth,topToBottom,hillToValleyLocal;
  DiffDescriptorBody dd;
  FILE *pathFP;
  BOOLEAN useL2,slopeConstrain,warp,lengthNormalize,topToBottomOption,hillToValleyOption;
  BOOLEAN separate;

  centerWeight = 1.0;
  normalBandWidth = 20;
  topToBottom = 1.0;
  hillToValleyLocal = 1.0;
  DefArg("%s %d %s %d %s","dict1 shape1 dict2 shape2 outfile",&dictFile1,&shape1,
         &dictFile2,&shape2,&outFile);
  DefOption("-L2","-L2",&useL2);
  DefOption("-slopeConstrain %f","-slopeConstrain <center weight>",
            &slopeConstrain,&centerWeight);
  DefOption("-warp %f %f","-warp <center weight> <band width>",
            &warp,&centerWeight,&normalBandWidth);
  DefOption("-separate","-separate",&separate);
  DefOption("-normalize","-normalize",&lengthNormalize);
  DefOption("-topToBottom %f","-topToBottom <ratio>",&topToBottomOption,&topToBottom);
  DefOption("-hillToValley%f","-hillToValley <ratio>",&hillToValleyOption,&hillToValleyLocal);
  ScanArgs(argc,argv);

  dict1 = ReadDictionary(dictFile1);
  dict2 = ReadDictionary(dictFile2);

  if ((shape1 >= dict1->numberOfEntries) || (shape1 < 0) ||
      (shape2 >= dict2->numberOfEntries) || (shape2 < 0))
    DoError("%s: bad shape number.\n",argv[0]);

  if ((pathFP = fopen(outFile,"w")) == NULL)
    DoError("single: error opening output file %s.\n",outFile);

  if (useL2) {
    fprintf(stdout,"Using L2 on length normalized shapes.\n");
    dd.diffType = L2;
  }
  else if (slopeConstrain) {
    fprintf(stdout,"Using dynamic time warping with slope constrained to [0.5,2].\n");
    dd.diffType = CONSTRAINED;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  else {
    fprintf(stdout,"Using dynamic time warping with bandwidth %d.\n",normalBandWidth);
    dd.diffType = WARP;
    dd.bandWidth = normalBandWidth;
    dd.separate = separate;
    if (separate)
      fprintf(stdout,"Top and bottom warped separately.\n");
    else
      fprintf(stdout,"Top and bottom warped together.\n");
  }
  if (!useL2) {
    fprintf(stdout,"Center weight = %f.\n",centerWeight);
    dd.centerWeight = centerWeight;
    if (lengthNormalize) {
      dd.lengthNormalize = TRUE;
      fprintf(stdout,"Scores normalized by signal length.\n");
    }
    else
      dd.lengthNormalize = FALSE;
  }
  dd.hillToValley = hillToValleyLocal;
  dd.topToBottom = topToBottom;
  dd.pathFP = pathFP;
  fprintf(stdout,"Top to bottom ratio = %6.2f.\n",topToBottom);
  fprintf(stdout,"Hill to Valley ratio = %6.2f.\n",hillToValleyLocal);

  score = DiffPair(*(dict1->outlines+shape1),
                   *(dict2->outlines+shape2),
                   &dd);

  fclose(pathFP);

  printf("Score = %f\n",score);
}
Jul 23 20:24 1991 slopeMatch.c

float SlopeConstrainedMatch(float *a1,float *a2,int aLength,
                            float *b1,float *b2,int bLength,
                            float maxSlope)
{
  float costs[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH];
  char down[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH];
  char left[MAX_SIGNAL_LENGTH][MAX_SIGNAL_LENGTH];

}
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Jul 12 14:36 1991 sortMatrixx 

#include <stdio.h>
#include "error.h"
#include "pict.h"

#define MAX_ENTRIES 5000

typedef struct {
  float score;
  int x;
  int y;
} *CompareTuple,CompareTupleBody;

int TupleLessThan(CompareTuple *x,CompareTuple *y)
{
  if ((*x)->score == (*y)->score)
    return 0;
  else if ((*x)->score < (*y)->score)
    return -1;
  else
    return 1;
}

void PrintTuple(CompareTuple a,FILE *fp)
{
  fprintf(fp,"(%d,%d): %f\n",a->x,a->y,a->score);
}

void main(int argc,char **argv)
{
  Picture pict;
  int i,j;
  int x,y;
  char *infile;
  CompareTuple scores[MAX_ENTRIES];

  if (argc != 2)
    DoError("Usage: %s infile.\n",argv[0]);
  infile = argv[1];

  pict = load_pict(infile);
  if (pict->width*pict->height > MAX_ENTRIES)
    DoError("%s: matrix has too many entries.\n",argv[0]);

  for (y=0,i=0;y<pict->height;++y)
    for (x=0;x<pict->width;++x) {
      CompareTuple temp;
      temp = (CompareTuple)calloc(1,sizeof(CompareTupleBody));
      if (temp == NULL)
        DoError("%s: cannot allocate space.\n",argv[0]);
      temp->score = *((float *)(pict->data) + x + y*pict->width);
      temp->x = x;
      temp->y = y;
      scores[i] = temp;
      ++i;
    }
  qsort(scores,i,sizeof(CompareTuple),TupleLessThan);
  for (j=0;j<i;++j)
    PrintTuple(scores[j],stdout);
}
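The program above sorts an array of pointers to score tuples with qsort. As a hedged illustration of the same idea under the standard C prototype, the sketch below uses a comparator taking const void pointers. The names ScoreTuple, compare_score and sort_score_tuples are hypothetical and do not appear in the listing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the listing's CompareTupleBody. */
typedef struct {
    float score;
    int x;
    int y;
} ScoreTuple;

/* Comparator with the signature qsort expects. Each element of the
 * array is a pointer to a tuple, so the arguments are pointers to
 * pointers. Orders by ascending score. */
static int compare_score(const void *a, const void *b)
{
    const ScoreTuple *ta = *(const ScoreTuple *const *)a;
    const ScoreTuple *tb = *(const ScoreTuple *const *)b;
    if (ta->score < tb->score)
        return -1;
    if (ta->score > tb->score)
        return 1;
    return 0;
}

/* Sort an array of tuple pointers in place, lowest score first. */
void sort_score_tuples(ScoreTuple **tuples, size_t count)
{
    assert(tuples != NULL || count == 0);
    qsort(tuples, count, sizeof *tuples, compare_score);
}
```

Comparing with explicit less-than and greater-than tests, rather than subtracting the scores, avoids truncating a fractional difference to zero.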
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Aug 26 17:54 1991 Makefile 



CCFLAGS = -g -c -I/net/piglet/piglet-1c/hopcroft/new/include

OFUNS = blobify.o orient.o lines.o newBaselines.o newMain.o types.o \
	newBlobify.o boxes.o newContour.o numbers.o fontNorm.o \
	dict.o

ALPHAOFUNS = orient.o lines.o baselines.o newMain.o types.o \
	blobify.o boxes.o newContour.o numbers.o alphaNorm.o \
	dict.o

SOURCES = Makefile baselines.c blobify.c boxes.c dict.c dmain.c getAll.c \
	getOutline.c lines.c newContour.c newDiff2.c newMain.c \
	numbers.c orient.c overlay.c fontNorm.c testFine.c types.c

EXTRNS = /net/piglet/piglet-1c/hopcroft/error/error.o \
	/net/piglet/piglet-1c/hopcroft/new/pict/pict.o \
	/net/piglet/piglet-1c/hopcroft/lists/lists.o

INCLUDE = /net/piglet/piglet-1c/hopcroft/new/include/
MISC = $(INCLUDE)misc.h
BOOLEAN = $(INCLUDE)boolean.h
LINES = $(INCLUDE)lines.h
LISTS = $(INCLUDE)lists.h
PICT = $(INCLUDE)pict.h
TYPES = $(INCLUDE)types.h
MYLIB = $(INCLUDE)mylib.h
ORIENT = $(INCLUDE)orient.h
BASELINES = $(INCLUDE)baselines.h
BLOBIFY = $(INCLUDE)blobify.h
BOXES = $(INCLUDE)boxes.h
CONTOUR = $(INCLUDE)newContour.h
DIFF = $(INCLUDE)diff.h
DICT = $(INCLUDE)dict.h
ERROR = $(INCLUDE)error.h
FONTNORM = $(INCLUDE)fontNorm.h

orient: $(OFUNS)
	gcc $(OFUNS) $(HOME)/new/lib/mylib.a /usr/lib/debug/malloc.o -lm -o $@

newBlobify: newBlobify.o
	gcc newBlobify.o ../lib/mylib.a -lm -o $@

makeAlphabet: $(ALPHAOFUNS)
	gcc $(ALPHAOFUNS) /usr/lib/debug/malloc.o $(EXTRNS) -lm -o $@

overlay: overlay.o
	gcc overlay.o $(EXTRNS) -o $@

testFine: testFine.o lines.o guassian.o types.o
	gcc testFine.o lines.o guassian.o types.o $(EXTRNS) -lm -o $@

boxes: boxes.o lines.o types.o
	gcc boxes.o lines.o types.o $(HOME)/new/lib/mylib.a -lm -o $@

getOutline: dict.o getOutline.o
	gcc getOutline.o dict.o $(EXTRNS) -lm -o $@

getAll: dict.o getAll.o
	gcc getAll.o dict.o $(EXTRNS) -lm -o $@

maxFilter: maxFilter.o
	gcc maxFilter.o $(HOME)/new/lib/mylib.a -lm -o $@

myWc: myWc.o
	gcc myWc.o $(EXTRNS) -o $@

printCode: $(SOURCES)
	/usr/5bin/pr -n3 $(SOURCES) | lpr -PWeeklyWorldNews

newBaselines.o: newBaselines.c $(BOOLEAN) $(PICT) $(TYPES) $(LISTS) $(LINES) \
		$(BASELINES)
	gcc $(CCFLAGS) newBaselines.c

blobify.o: blobify.c $(BOOLEAN) $(PICT) $(BLOBIFY)
	gcc $(CCFLAGS) blobify.c

boxes.o: boxes.c $(BOOLEAN) $(PICT) $(TYPES) $(BOXES)
	gcc $(CCFLAGS) boxes.c

dict.o: dict.c $(BOOLEAN) $(TYPES) $(ERROR) $(PICT) $(DICT)
	gcc $(CCFLAGS) dict.c

dmain.o: dmain.c $(BOOLEAN) $(PICT) $(DIFF)
	gcc $(CCFLAGS) dmain.c

getAll.o: getAll.c $(BOOLEAN) $(TYPES) $(PICT) $(DICT)
	gcc $(CCFLAGS) getAll.c

getOutline.o: getOutline.c $(BOOLEAN) $(TYPES) $(PICT) $(DICT)
	gcc $(CCFLAGS) getOutline.c

guassian.o: guassian.c
	gcc $(CCFLAGS) guassian.c

lines.o: lines.c $(BOOLEAN) $(PICT) $(LINES)
	gcc $(CCFLAGS) lines.c

maxFilter.o: maxFilter.c $(MYLIB)
	gcc $(CCFLAGS) maxFilter.c

myWc.o: myWc.c $(BOOLEAN) $(ERROR)
	gcc $(CCFLAGS) myWc.c

newBlobify.o: newBlobify.c $(MYLIB) $(BLOBIFY)
	gcc $(CCFLAGS) newBlobify.c

newContour.o: newContour.c $(BOOLEAN) $(PICT) $(TYPES) $(LINES) \
		$(LISTS) $(CONTOUR) $(FONTNORM)
	gcc $(CCFLAGS) newContour.c

newDiff2.o: newDiff2.c $(BOOLEAN) $(TYPES) $(PICT) $(DIFF)
	gcc $(CCFLAGS) newDiff2.c

newMain.o: newMain.c $(BOOLEAN) $(PICT) $(LISTS) $(LINES) \
		$(ORIENT) $(BASELINES) $(BLOBIFY) $(BOXES) $(CONTOUR) $(ORIENT)
	gcc $(CCFLAGS) newMain.c

numbers.o: numbers.c $(BOOLEAN) $(PICT) $(LINES)
	gcc $(CCFLAGS) numbers.c

orient.o: orient.c $(BOOLEAN) $(TYPES) $(PICT) $(ORIENT) $(LINES)
	gcc $(CCFLAGS) orient.c

overlay.o: overlay.c $(BOOLEAN) $(PICT)
	gcc $(CCFLAGS) overlay.c

postproc.o: postproc.c $(BOOLEAN) $(TYPES) $(ERROR) $(PICT) $(DICT)
	gcc $(CCFLAGS) postproc.c

alphaNorm.o: alphaNorm.c $(BOOLEAN) $(TYPES) $(ERROR) $(PICT) $(DICT) $(FONTNORM)
	gcc $(CCFLAGS) alphaNorm.c

fontNorm.o: fontNorm.c $(BOOLEAN) $(TYPES) $(ERROR) $(PICT) $(DICT) $(FONTNORM)
	gcc $(CCFLAGS) fontNorm.c

testFine.o: testFine.c
	gcc $(CCFLAGS) testFine.c

types.o: types.c $(TYPES) $(ERROR)
	gcc $(CCFLAGS) types.c
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Aug 515:451991 alphaNorm.c 

#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "fontNorm.h"

/* This file is just like fontNorm.c, but assumes that the input is data for an alphabet
 * dictionary. This data is
 *
 *   a-z
 *
 *   A-Z
 *
 *   0-9
 *
 *   |!@#$ c /o-&*0 + \-=0n;:/<>?
 *
 * The x height will be measured from the x(23). The ascender height will be measured
 * from the l(11).
 */

#define X_HEIGHT_SHAPE 23
#define ASC_HEIGHT_SHAPE 11

extern double ceil(double);
extern int irint(double);

#define UP 0
#define DOWN 1
typedef int Direction;

extern Picture thePict;

void StoreRawOutlinePair(Dictionary dict,int dictEntry,
                         Box box,int *bothX,int *topY,int *baseY,
                         int numberOfLegs)
{
  RawOutlinePair temp;
  int i;
  int *xCursor,*topCursor,*bottomCursor;

  temp = (RawOutlinePair)calloc(1,sizeof(RawOutlinePairBody));
  if (temp == NULL)
    DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

  temp->box = box;
  temp->numberOfLegs = numberOfLegs;

  temp->x = (int *)calloc(temp->numberOfLegs,sizeof(int));
  temp->top = (int *)calloc(temp->numberOfLegs,sizeof(int));
  temp->bottom = (int *)calloc(temp->numberOfLegs,sizeof(int));
  if ((temp->x == NULL) ||
      (temp->top == NULL) ||
      (temp->bottom == NULL))
    DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

  xCursor = temp->x;
  topCursor = temp->top;
  bottomCursor = temp->bottom;

  for (i=0;i<numberOfLegs;++i) {
    *xCursor++ = *bothX++;
    *topCursor++ = *topY++;
    *bottomCursor++ = *baseY++;
  }
  *(dict->rawOutlines+dictEntry) = temp;
}

int RawOutlineWidth(RawOutlinePair a,int middleLine)
{
  int i,numberOfLegs,right,left;
  int *topCursor,*bottomCursor;
  int topValue,bottomValue;

  numberOfLegs = a->numberOfLegs;

  topCursor = a->top;
  bottomCursor = a->bottom;
  for (i=0;i<numberOfLegs;++i) {
    topValue = *topCursor++;
    bottomValue = *bottomCursor++;

    if (topValue != HIT_THE_BOX) {
      topValue = middleLine - topValue;
      if (topValue < 0)
        topValue = 0;
    }
    else
      topValue = 0;

    if (bottomValue != HIT_THE_BOX) {
      bottomValue = bottomValue - middleLine;
      if (bottomValue < 0)
        bottomValue = 0;
    }
    else
      bottomValue = 0;

    if ((bottomValue != 0)||(topValue != 0))
      break;
  }
  left = i;

  topCursor = a->top+numberOfLegs-1;
  bottomCursor = a->bottom+numberOfLegs-1;
  for (i=numberOfLegs-1;i>=0;--i) {
    topValue = *topCursor--;
    bottomValue = *bottomCursor--;

    if (topValue != HIT_THE_BOX) {
      topValue = middleLine - topValue;
      if (topValue < 0)
        topValue = 0;
    }
    else
      topValue = 0;

    if (bottomValue != HIT_THE_BOX) {
      bottomValue = bottomValue - middleLine;
      if (bottomValue < 0)
        bottomValue = 0;
    }
    else bottomValue = 0;

    if ((topValue != 0)||(bottomValue != 0))
      break;
  }
  right = i + 1;

  return right-left;
}

void ResampleOutlinePair(OutlinePair a,float newToOldFactor)
/* Resample an outline pair using linear interpolation. */
{
  int newWidth,oldWidth,i;
  int oldLeft,oldRight;
  float oldCenter;
  float *newX,*newTop,*newBottom;
  float *xCursor,*topCursor,*bottomCursor;

  oldWidth = a->numberOfLegs;
  newWidth = irint(newToOldFactor*oldWidth);

  newX = (float *)calloc(newWidth,sizeof(float));
  newTop = (float *)calloc(newWidth,sizeof(float));
  newBottom = (float *)calloc(newWidth,sizeof(float));
  if ((newX==NULL)||(newTop==NULL)||(newBottom==NULL))
    DoError("ResampleOutlinePair: cannot allocate space.\n",NULL);

  xCursor = newX;
  topCursor = newTop;
  bottomCursor = newBottom;

  for (i=0;i<newWidth;++i) {
    oldCenter = i/(float)newWidth*(float)oldWidth;
    oldLeft = irint(floor(oldCenter));
    oldRight = irint(ceil(oldCenter));
    if (oldLeft==oldRight) {
      *xCursor++ = *(a->x+oldLeft);
      *topCursor++ = *(a->top+oldLeft);
      *bottomCursor++ = *(a->bottom+oldLeft);
    }
    else {
      float slope;
      slope = *(a->x+oldRight)-*(a->x+oldLeft);
      *xCursor++ = *(a->x+oldLeft) + (oldCenter-oldLeft)*slope;
      slope = *(a->top+oldRight)-*(a->top+oldLeft);
      *topCursor++ = *(a->top+oldLeft) + (oldCenter-oldLeft)*slope;
      slope = *(a->bottom+oldRight)-*(a->bottom+oldLeft);
      *bottomCursor++ = *(a->bottom+oldLeft) + (oldCenter-oldLeft)*slope;
    }
  }

  free(a->x);
  free(a->top);
  free(a->bottom);

  a->x = newX;
  a->top = newTop;
  a->bottom = newBottom;
  a->numberOfLegs = newWidth;
}
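ResampleOutlinePair stretches or shrinks a contour by linear interpolation between neighbouring samples. The sketch below shows the same technique on a single array, as an illustration only. The name resample_linear is hypothetical, and the clamping of the right-hand neighbour to the last sample is an added safeguard assumed here, since the ceiling of the source position can otherwise equal the source length when upsampling.

```c
#include <assert.h>
#include <stdlib.h>

/* Resample src (srcLen samples) to dstLen samples by linear
 * interpolation. Returns a newly allocated array that the caller
 * must free, or NULL on bad arguments or allocation failure. */
float *resample_linear(const float *src, int srcLen, int dstLen)
{
    float *dst;
    int i;

    assert(src != NULL);
    if (srcLen < 1 || dstLen < 1)
        return NULL;
    dst = (float *)calloc((size_t)dstLen, sizeof(float));
    if (dst == NULL)
        return NULL;

    for (i = 0; i < dstLen; ++i) {
        /* Position of output sample i in source coordinates. */
        float center = (float)i / (float)dstLen * (float)srcLen;
        int left = (int)center;     /* center >= 0, so this is floor */
        int right;
        float frac;

        if (left > srcLen - 1)
            left = srcLen - 1;      /* guard against rounding */
        right = left + 1;
        if (right > srcLen - 1)
            right = srcLen - 1;     /* clamp to the last sample */
        frac = center - (float)left;
        dst[i] = src[left] + frac * (src[right] - src[left]);
    }
    return dst;
}
```

With the clamp in place, output positions that fall past the last source sample simply repeat that sample instead of reading beyond the array.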

void StoreOutlinePair(Dictionary dict,int dictEntry,
                      int middleLine,int fontXHeight,
                      int ascenderHeight)
/* This routine normalizes the raw outline pair stored in dict at dictEntry using the following
 * operations:
 * 1) For the top contour, shift so that the middle line is at y=0 and negate so that the
 *    higher points are greater than 0. For the bottom, shift so that middle line is at y=0,
 *    but don't flip. Thus, lower points have y coordinates greater than 0.
 *    Consider points whose value is HIT_THE_BOX to be at y=0. These correspond to gaps
 *    between the letters.
 * 2) Compress top and bottom y coordinates by 1/fontXHeight so that the coordinates at the
 *    distance of the fontXHeight have value 1. Note that 1 is an arbitrary number. It is
 *    unlikely that a signal will have parts that are the x height above the center line
 *    anyway.
 *    FOR TOP CONTOUR,
 *    IF HEIGHT IS GREATER THAN XHEIGHT, SCALE DIFFERENCE BY 1.5/ASCENDER HEIGHT.
 *    ELSE SCALE DIFFERENCE BY 1/XHEIGHT.
 *    FOR BOTTOM CONTOUR,
 *    SCALE BY 1.5/ASCENDER HEIGHT.
 * 3) Compress the x coordinates by the same factor as in step 2. Note that this does not
 *    actually resample the contour. NOW DO THIS WITH RESAMPLE. USE SCALE FACTOR OF
 *    20/XHEIGHT.
 * 4) Remove left and right ends of the contour that have y values of zero. This is so the
 *    contour starts where the word starts, rather than at the edge of its bounding box.
 * 5) Resample the contour to stretch by firstFontXwidth/fontxWidth. KILL THIS OPERATION.
 */

{
  RawOutlinePair raw;
  OutlinePair temp;
  int i,numberOfLegs;
  int y;
  int offset;

  int *xSCursor,*topSCursor,*bottomSCursor;
  float *xDCursor,*topDCursor,*bottomDCursor;
  float *xCursor,*topCursor,*bottomCursor;
  int left,right;
  float foffset;

  raw = *(dict->rawOutlines+dictEntry);

  temp = (OutlinePair)calloc(1,sizeof(OutlinePairBody));
  if (temp == NULL)
    DoError("StoreOutlinePair: cannot allocate space\n",NULL);

  temp->x = (float *)calloc(raw->numberOfLegs,sizeof(float));
  temp->top = (float *)calloc(raw->numberOfLegs,sizeof(float));
  temp->bottom = (float *)calloc(raw->numberOfLegs,sizeof(float));
  if ((temp->x == NULL) ||
      (temp->top == NULL) ||
      (temp->bottom == NULL))
    DoError("StoreOutlinePair: cannot allocate space\n",NULL);

  temp->box = raw->box;
  temp->blackoutHeight = 0;
  temp->numberOfLegs = raw->numberOfLegs;
  offset = temp->offset = *(raw->x);
  temp->width = *(raw->x+raw->numberOfLegs-1)-temp->offset;

  xDCursor = temp->x;
  topDCursor = temp->top;
  bottomDCursor = temp->bottom;
  xSCursor = raw->x;
  topSCursor = raw->top;
  bottomSCursor = raw->bottom;

  numberOfLegs = raw->numberOfLegs;
  for (i=0;i<numberOfLegs;++i) {
    /* *xDCursor++ = (float)(*xSCursor++ - offset)/fontXHeight; */
    if (*topSCursor == HIT_THE_BOX) {
      y = 0;
      topSCursor++;
    }
    else {
      y = middleLine - *topSCursor++;
      if (y<0)
        y = 0;
    }
    if (y>fontXHeight/2)
      *topDCursor++ = (float)y * 1.5 / ascenderHeight;
    else
      *topDCursor++ = (float)y / fontXHeight;

    if (*bottomSCursor == HIT_THE_BOX) {
      y = 0;
      bottomSCursor++;
    }
    else {
      y = *bottomSCursor++ - middleLine;
      if (y<0)
        y = 0;
    }
    if (y>fontXHeight/2)
      *bottomDCursor++ = (float)y / fontXHeight;
    else
      *bottomDCursor++ = (float)y * 1.5 / ascenderHeight;
  }

  /* Now try to remove parts of the contour on to the left and right of the
   * word shape that are at height 0 */

  /* Find left edge */
  topDCursor = temp->top;
  bottomDCursor = temp->bottom;
  for (i=0;i<numberOfLegs;++i) {
    if ((*topDCursor++ != 0)||(*bottomDCursor++ != 0))
      break;
  }
  left = i;

  /* Find right edge */
  topDCursor = temp->top + numberOfLegs-1;
  bottomDCursor = temp->bottom + numberOfLegs-1;
  for (i=numberOfLegs-1;i>=0;--i) {
    if ((*topDCursor-- != 0)||(*bottomDCursor-- != 0))
      break;
  }
  right = i + 1;

  /* Clip the ends of the contour at left and right */
  xDCursor = temp->x;
  topDCursor = temp->top;
  bottomDCursor = temp->bottom;
  xCursor = temp->x+left;
  topCursor = temp->top+left;
  bottomCursor = temp->bottom+left;
  foffset = *xSCursor;
  for (i=left;i<right;++i) {
    *xDCursor++ = *xCursor++ - foffset;
    *topDCursor++ = *topCursor++;
    *bottomDCursor++ = *bottomCursor++;
  }
  temp->numberOfLegs = right-left;

  *(dict->outlines+dictEntry) = temp;
  ResampleOutlinePair(*(dict->outlines+dictEntry),(float)20/(float)fontXHeight);
}
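Steps 1, 2 and 4 of the StoreOutlinePair comment can be illustrated in isolation. The sketch below is an assumption-laden reading of the listing, not a replacement for it: normalize_top_height mirrors the top-contour branch of the code (heights above half the x height are scaled by 1.5/ascenderHeight, others by 1/fontXHeight), and trim_zero_ends finds the half-open range of samples left after zero-height ends are removed. Both function names are hypothetical, and the handling of the HIT_THE_BOX sentinel is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Height of a raw top-contour sample above the middle line, clamped
 * at zero and scaled as in the top-contour branch of the listing.
 * Image y grows downward, so the height is middleLine - rawY. */
float normalize_top_height(int rawY, int middleLine,
                           int fontXHeight, int ascenderHeight)
{
    int y = middleLine - rawY;

    assert(fontXHeight > 0 && ascenderHeight > 0);
    if (y < 0)
        y = 0;
    if (y > fontXHeight / 2)
        return (float)y * 1.5f / (float)ascenderHeight;
    return (float)y / (float)fontXHeight;
}

/* Find the half-open range [*left, *right) that remains after
 * dropping leading and trailing samples where both contours are
 * zero. If every sample is zero the range is empty. */
void trim_zero_ends(const float *top, const float *bottom, int n,
                    int *left, int *right)
{
    int i;

    assert(top != NULL && bottom != NULL);
    assert(left != NULL && right != NULL);
    for (i = 0; i < n; ++i)
        if (top[i] != 0.0f || bottom[i] != 0.0f)
            break;
    *left = i;
    for (i = n - 1; i >= *left; --i)
        if (top[i] != 0.0f || bottom[i] != 0.0f)
            break;
    *right = i + 1;
}
```

Stopping the backward scan at the left edge keeps the range well formed even for an all-zero contour, where it reports an empty range rather than a negative width.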

static int lineSpacing;
int OrderOutlinePair(OutlinePair *o1,OutlinePair *o2)
{
  int yDistance;
  int xDistance;
  yDistance = (*o1)->box->pageY-(*o2)->box->pageY;
  if (yDistance < lineSpacing && yDistance > -lineSpacing) {
    xDistance = (*o1)->box->pageX-(*o2)->box->pageX;
    return xDistance;
  }
  return yDistance;
}

void SortDictionary(Dictionary dict)
{
  lineSpacing = 20;
  qsort(dict->rawOutlines,dict->numberOfEntries,sizeof(RawOutlinePair),
        OrderOutlinePair);
}

/* WARNING - assumes at least one entry is not equal to HIT_THE_BOX */
float MaxTopValue(RawOutlinePair o)
{
  int i;
  float maxValue;
  maxValue = *(o->top);
  for (i=0;i<o->numberOfLegs;++i)
    if (*(o->top+i) > maxValue && *(o->top+i) != HIT_THE_BOX)
      maxValue = *(o->top+i);
  return maxValue;
}

/* WARNING - assumes at least one entry is not equal to HIT_THE_BOX */
float MinTopValue(RawOutlinePair o)
{
  int i;
  float minValue;
  minValue = *(o->top);
  for (i=0;i<o->numberOfLegs;++i)
    if (*(o->top+i) < minValue && *(o->top+i) != HIT_THE_BOX)
      minValue = *(o->top+i);
  return minValue;
}

#define HIST_SIZE 100
void HistogramMax(int *data,int dataLength,int offset,int sign,int *histogram)
{
  int i,bin;

  if (sign>0) {
    int maxValue;

    maxValue = *data;
    for (i=0;i<dataLength;++i)
      if (data[i] != HIT_THE_BOX) {
        maxValue = data[i];
        break;
      }
    for (;i<dataLength;++i)
      if (data[i] != HIT_THE_BOX && data[i] > maxValue)
        maxValue = data[i];
    if (maxValue != HIT_THE_BOX) {
      bin = maxValue-offset;
      if ((bin>=0)&&(bin<HIST_SIZE))
        histogram[bin]++;
    }
  }
  else {
    int minValue;
    minValue = *data;
    for (i=0;i<dataLength;++i)
      if (data[i] != HIT_THE_BOX) {
        minValue = data[i];
        break;
      }
    for (;i<dataLength;++i)
      if (data[i] != HIT_THE_BOX && data[i] < minValue)
        minValue = data[i];
    if (minValue != HIT_THE_BOX) {
      bin = minValue-offset;
      if ((bin>=0)&&(bin<HIST_SIZE))
        histogram[bin]++;
    }
  }
}

void Histogram(int *data,int dataLength,int offset,int *histogram)
{
  int i,bin;

  for (i=0;i<dataLength;++i) {
    if (*data != HIT_THE_BOX) {
      bin = *data-offset;
      if ((bin>=0)&&(bin<HIST_SIZE))
        histogram[bin]++;
    }
    data++;
  }
}

int MaxBin(int *histogram)
{
  int i;
  int maxValue;
  int maxIndex;

  maxValue = histogram[0];
  maxIndex = 0;
  for (i=0;i<HIST_SIZE;++i)
    if (histogram[i]>maxValue) {
      maxValue = histogram[i];
      maxIndex = i;
    }
  return maxIndex;
}
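Histogram and MaxBin together pick the most common contour value on a text line, skipping gap markers. The sketch below combines the two steps in one function as an illustration. GAP_SENTINEL is a hypothetical stand-in for HIT_THE_BOX, whose actual value is defined outside this listing, and NUM_BINS and histogram_mode are likewise assumed names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define GAP_SENTINEL INT_MAX   /* hypothetical stand-in for HIT_THE_BOX */
#define NUM_BINS 100

/* Return the most frequent value in data, ignoring sentinel entries
 * and values outside [offset, offset + NUM_BINS). Ties go to the
 * smallest value. If nothing is counted, offset itself is returned. */
int histogram_mode(const int *data, int n, int offset)
{
    int hist[NUM_BINS] = {0};
    int i, best = 0;

    assert(data != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        int bin;
        if (data[i] == GAP_SENTINEL)
            continue;                   /* gap between letters */
        if (data[i] < offset)
            continue;                   /* below the histogram range */
        bin = data[i] - offset;
        if (bin < NUM_BINS)
            hist[bin]++;
    }
    for (i = 1; i < NUM_BINS; ++i)
        if (hist[i] > hist[best])
            best = i;
    return best + offset;
}
```

Taking the mode rather than the mean keeps a few descenders or stray marks from pulling the estimated baseline away from where most characters sit.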

int MaxBinAbove(int *histogram,int line)
{
  int i;
  int maxValue;
  int maxIndex;
  int top,bottom;

  for (i=0;i<HIST_SIZE;++i)
    if (histogram[i] != 0)
      break;

  top = i;
  bottom = (line+top)/2;

  maxValue = histogram[top];
  maxIndex = top;
  for (i=top;i<=bottom;++i)
    if (histogram[i]>maxValue) {
      maxValue = histogram[i];
      maxIndex = i;
    }
  return maxIndex;
}

void DrawTextLines(Picture thePict,Dictionary dict,int topLine,int bottomLine)
{
  int maxLength;
  int halfWidth;
  int x,y;
  float x2,x3,y2,y3;
  float angle;

  angle = (*(dict->rawOutlines))->box->angle;
  maxLength = thePict->width + thePict->height;
  halfWidth = thePict->width / 2;
  x = topLine * -sin(angle) + halfWidth * cos(angle);
  y = topLine * cos(angle) + halfWidth * sin(angle);
  x2 = x+maxLength*cos(angle);
  y2 = y+maxLength*sin(angle);
  x3 = x-maxLength*cos(angle);
  y3 = y-maxLength*sin(angle);
  DrawLine(thePict,x,y,(int)x2,(int)y2,5);
  DrawLine(thePict,x,y,(int)x3,(int)y3,5);

  x = bottomLine * -sin(angle) + halfWidth * cos(angle);
  y = bottomLine * cos(angle) + halfWidth * sin(angle);
  x2 = x+maxLength*cos(angle);
  y2 = y+maxLength*sin(angle);
  x3 = x-maxLength*cos(angle);
  y3 = y-maxLength*sin(angle);
  DrawLine(thePict,x,y,(int)x2,(int)y2,5);
  DrawLine(thePict,x,y,(int)x3,(int)y3,5);
}

void PageStatistics(Dictionary dict,char *fileName)
/* WARNING - this must be run before PostProcess since PostProcess changes the raw
 * shape data. */
{
  int index;
  int temp;
  int i,startIndex,firstY,minY,endIndex,shape;
  int tops[HIST_SIZE];
  int bottoms[HIST_SIZE];
  int ascenders[HIST_SIZE];
  int descenders[HIST_SIZE];
  int middleLine,topLine,bottomLine,ascenderLine,descenderLine;
  int ascenderHeight,descenderHeight,lineNumber;
  int fontXHeight,fontXWidth,xIndex;
  RawOutlinePair thisShape;
  FILE *fp;
  BOOLEAN haveFirstFontXWidth = FALSE;
  int firstFontXWidth;

  if ((fp=fopen(fileName,"w"))==NULL)
    DoError("PageStatistics: error opening output file %s.\n",fileName);

  SortDictionary(dict);

  index = 0;
  lineNumber = 0;
  while (index < dict->numberOfEntries) {
    startIndex = index;
    firstY = (*(dict->rawOutlines+index))->box->pageY;
    minY = firstY;
    while ((*(dict->rawOutlines+index))->box->pageY-firstY < 20 &&
           (*(dict->rawOutlines+index))->box->pageY-firstY > -20) {
      if (minY > (*(dict->rawOutlines+index))->box->pageY)
        minY = (*(dict->rawOutlines+index))->box->pageY;
      ++index;
      if (index == dict->numberOfEntries)
        break;
    }
    endIndex = index;

    /* shapes from startIndex through endIndex are all on */
    /* the same text line */
    /* minY has the top of the highest box on the line. */

    /* Find the base and toplines by taking the mode of the heights of the
     * valleys of the bottom contours and the peaks of the top contours */
    for (i=0;i<HIST_SIZE;i++) {
      bottoms[i] = 0;
    }

    for (shape=startIndex;shape<endIndex;++shape) {
      thisShape = *(dict->rawOutlines+shape);
      Histogram(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);
    }
    bottomLine = MaxBin(bottoms)+minY;
    if (X_HEIGHT_SHAPE>=startIndex&&X_HEIGHT_SHAPE<endIndex) {
      topLine = MinTopValue(*(dict->rawOutlines+X_HEIGHT_SHAPE));
      fontXHeight = bottomLine - topLine;
    }
    if (ASC_HEIGHT_SHAPE>=startIndex&&ASC_HEIGHT_SHAPE<endIndex) {
      ascenderLine = MinTopValue(*(dict->rawOutlines+ASC_HEIGHT_SHAPE));
      ascenderHeight = bottomLine - ascenderLine;
    }
    middleLine = bottomLine-fontXHeight/2;
    topLine = bottomLine-fontXHeight;

    if (thePict)
      DrawTextLines(thePict,dict,topLine,bottomLine);

    fprintf(fp,"%d: %d %d %2.6f\n",lineNumber,fontXHeight,ascenderHeight,
            (float)ascenderHeight/(float)fontXHeight);

    for (shape=startIndex;shape<endIndex;++shape)
      StoreOutlinePair(dict,shape,middleLine,fontXHeight,ascenderHeight);

    ++lineNumber;
  } /* Do another line of text */
  fclose(fp);
}
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Aug 21 19:501991 baselines.c 

#include <stdio.h>
#include <values.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "types.h"
#include "lists.h"
#include "lines.h"
#include "baselines.h"

extern double sqrt(double);
extern int irint(double);

/*inline*/ int NewReadPixel(UCHAR *base,int width,float x,float y)
{
  int xi;
  int yi;
  UCHAR mask;

  xi = irint(x);
  yi = irint(y);
  mask = 0x80 >> (xi & 0x7);
  return *(base+yi*width+(xi>>3)) & mask;
}
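NewReadPixel addresses a one-bit-per-pixel image stored as rows of bytes, with the leftmost pixel of each byte in the most significant bit. The sketch below shows that addressing scheme on integer coordinates. The name read_packed_pixel is hypothetical, and unlike the listing it returns exactly 0 or 1 rather than the masked byte.

```c
#include <assert.h>
#include <stddef.h>

/* Read pixel (x, y) from a packed 1-bit image.
 * bytesPerRow is the row stride in bytes; bit 7 of each byte is the
 * leftmost of its eight pixels. Returns 1 if the bit is set, else 0. */
int read_packed_pixel(const unsigned char *base, int bytesPerRow,
                      int x, int y)
{
    unsigned char mask;

    assert(base != NULL && x >= 0 && y >= 0);
    mask = (unsigned char)(0x80 >> (x & 0x7));   /* bit within the byte */
    return (base[y * bytesPerRow + (x >> 3)] & mask) != 0;
}
```

The shift by three selects the byte within the row and the low three bits select the bit within that byte, so no division or modulo is needed.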

void NewCountLine1Bit(Picture pict,int x1,int y1,int x2,int y2,int *black,int *blackEdge)
{
  float x,y;
  float xinc,yinc;
  float xupinc,yupinc;
  float den;
  int b,be;
  int width,ucharWidth,height;
  UCHAR *data;

  width = pict->width;
  ucharWidth = pict->uchar_width;
  height = pict->height;
  data = pict->data;

  den = sqrt((y2-y1)*(y2-y1)+(x2-x1)*(x2-x1));
  xinc = (x2-x1)/den;
  yinc = (y2-y1)/den;
  xupinc = -yinc;
  yupinc = xinc;
  x = x1;
  y = y1;

  b = 0;
  be = 0;

  while (x<width&&x>=0&&y<height&&y>=0) {
    ++b;
    if (NewReadPixel(data,ucharWidth,x,y)) {
      if (!(NewReadPixel(data,ucharWidth,x+xupinc,y+yupinc) &&
            NewReadPixel(data,ucharWidth,x-xupinc,y-yupinc)))
        ++be;
    }
    x += xinc;
    y += yinc;

  }
  *black = b;
  *blackEdge = be;
}

#define MIN_BLACK 5
float NewCountLine(Picture pict,int x1,int y1,int x2,int y2)
{
  int black,blackEdge;
  black = 0;
  blackEdge = 0;
  NewCountLine1Bit(pict,x1,y1,x2,y2,&black,&blackEdge);
  NewCountLine1Bit(pict,x1,y1,x1-(x2-x1),y1-(y2-y1),&black,&blackEdge);
  if (black < MIN_BLACK)
    return 0;
  else
    return (float)blackEdge/black;
}

static float x2offset;
static float y2offset;
static int projectIndex;
static float *projection;
static int *coordx;
static int *coordy;
BOOLEAN BaseLinePiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  if (test) {
    /* if (!(projectIndex%10))
       DrawLine(pict,x,y,(int)(x+x2offset),(int)(y+y2offset),0xff); */
    /* WritePixel(pict,x,y,0xff); */
    projection[projectIndex] = NewCountLine(pict,x,y,(int)(x+x2offset),
                                            (int)(y+y2offset));
    coordx[projectIndex] = x;
    coordy[projectIndex++] = y;
    return test;
  } else
    return test;
}

static int lastX;
static int lastY;
BOOLEAN EndPointPiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  if (test) {
    lastX = x;
    lastY = y;
  }
  return test;
}

void EndPoints(Picture pict,double angle,int *tx,int *ty,int *bx,int *by)
{
  int xc,yc;
  int maxLength;
  float normal;
  float x2,y2,x3,y3;

  /* Make normal to text point in quadrants I and II */
  /* Assume 0 <= angle < 2*M_PI */
  normal = fmod(angle+M_PI/2,2*M_PI);
  if (normal > M_PI)
    normal -= M_PI;

  xc = pict->width/2;
  yc = pict->height/2;

  maxLength = pict->width+pict->height;
  x2 = xc+maxLength*cos(normal);   /* At bottom of picture */
  y2 = yc+maxLength*sin(normal);
  x3 = xc-maxLength*cos(normal);   /* At top of picture */
  y3 = yc-maxLength*sin(normal);

  LineEngine(pict,xc,yc,(int)x2,(int)y2,0,EndPointPiston);
  *bx = lastX;
  *by = lastY;
  LineEngine(pict,xc,yc,(int)x3,(int)y3,0,EndPointPiston);
  *tx = lastX;
  *ty = lastY;
}

double distance(int x1,int y1,int x2,int y2)
{
  return sqrt((double)((x1-x2)*(x1-x2)+(y1-y2)*(y1-y2)));
}

#define BASE_PERCENTILE 0.20
#define MIN_LINE_HEIGHT_FRACTION 0.50
List BaseLines(Picture pict,double angle,char *plotFile)
#ifdef foo
               ,int *count,
               int **returnCoordx,int **returnCoordy)
#endif
{
  float *topProjection;
  int *topCoordx,*topCoordy;
  int *finalCoordx,*finalCoordy,*finalIndex;
  int topIndex,bottomIndex;
  int topCount,botCount,finalCount;
  int maxLength;
  int xc,yc;
  float x2,y2,x3,y3;
  float maxValue,lastValue;
  int i,j;
  float baseThresh;
  int topX,topY,bottomX,bottomY;
  BOOLEAN onTextLine;
  List xList,yList,result;
  double totalDistance,averageDistance;
  FILE *outfile;

  printf("angle = %3.3f\n",angle);

  maxLength = pict->width+pict->height;

  topProjection = (float *)calloc(maxLength,sizeof(float));
  topCoordx = (int *)calloc(maxLength,sizeof(int));
  topCoordy = (int *)calloc(maxLength,sizeof(int));
  finalCoordx = (int *)calloc(maxLength,sizeof(int));
  finalCoordy = (int *)calloc(maxLength,sizeof(int));
  finalIndex = (int *)calloc(maxLength,sizeof(int));

  if ((topProjection == NULL)||
      (topCoordx == NULL)||
      (topCoordy == NULL)||
      (finalIndex == NULL)||
      (finalCoordx == NULL)||
      (finalCoordy == NULL)) {
    printf("BaseLines: cannot allocate memory\n");
    exit(-1);
  }

  EndPoints(pict,angle,&topX,&topY,&bottomX,&bottomY);

  printf("Main Line: (%d,%d)-(%d,%d)\n",topX,topY,bottomX,bottomY);
  /* DrawLine(pict,topX,topY,bottomX,bottomY,0xff); */

  x2offset = maxLength*cos(angle);
  y2offset = maxLength*sin(angle);
  projectIndex = 0;
  projection = topProjection;
  coordx = topCoordx;
  coordy = topCoordy;
  LineEngine(pict,topX,topY,bottomX,bottomY,0,BaseLinePiston);
  topCount = projectIndex;

  maxValue = topProjection[0];
  for (i=0;i<topCount;++i) {
    if (topProjection[i]>maxValue)
      maxValue = topProjection[i];
  }

  baseThresh = maxValue*BASE_PERCENTILE;
  printf("baseThresh = %3.3f\n",baseThresh);

  /* Plot the baseline contour if requested */
  if (plotFile!=NULL) {
    printf("Opening baselines plot file\n");
    if ((outfile = fopen(plotFile,"w"))==NULL) {
      printf("Error opening baseline plotfile.\n");
      exit(-1);
    }
    for (i=0;i<topCount;++i)
      fprintf(outfile,"%d %f\n",i,topProjection[i]);
    fprintf(outfile,"\"Projection\n\n");
    fprintf(outfile,
            "0 %f\n%d %f\n\"Baseline Threshold\n\n",
            baseThresh,topCount,baseThresh);
  }

  finalCount = 0;
  lastValue = topProjection[topCount-1];
  onTextLine = FALSE;
  for (i=1;i<topCount;++i) {
    if (onTextLine) {
      if (lastValue>baseThresh && topProjection[i]<=baseThresh) {
        finalCoordx[finalCount] = topCoordx[i];
        finalCoordy[finalCount] = topCoordy[i];
        finalIndex[finalCount] = i;
        finalCount++;
        onTextLine = FALSE;
      }
    } else {
      if (lastValue<=baseThresh && topProjection[i]>baseThresh) {
        finalCoordx[finalCount] = topCoordx[i];
        finalCoordy[finalCount] = topCoordy[i];
        finalIndex[finalCount] = i;
        finalCount++;
        onTextLine = TRUE;
      }
    }

    lastValue = topProjection[i];
  }
  if (finalCount&1)
    --finalCount;   /* Only take an even number of lines */
  for (totalDistance=0,i=0,j=0;i<finalCount;i+=2) {
    topX = finalCoordx[i];
    topY = finalCoordy[i];
    bottomX = finalCoordx[i+1];
    bottomY = finalCoordy[i+1];
    totalDistance += distance(topX,topY,bottomX,bottomY);
    j += 2;
  }
  averageDistance = totalDistance / (finalCount/2)*MIN_LINE_HEIGHT_FRACTION;
  for (i=0,j=0;i<finalCount;i+=2) {
    topX = finalCoordx[i];
    topY = finalCoordy[i];
    topIndex = finalIndex[i];
    bottomX = finalCoordx[i+1];
    bottomY = finalCoordy[i+1];
    bottomIndex = finalIndex[i+1];
    finalCoordx[j] = topX;
    finalCoordy[j] = topY;
    finalIndex[j] = topIndex;
    finalCoordx[j+1] = bottomX;
    finalCoordy[j+1] = bottomY;
    finalIndex[j+1] = bottomIndex;
    if (distance(topX,topY,bottomX,bottomY)>averageDistance)
      j += 2;
  }
#ifdef foo
  *count = j;
  *returnCoordx = finalCoordx;
  *returnCoordy = finalCoordy;
#endif
  result = nil;
  for (i=j-1;i>=0;--i) {
    push(MakePoint(finalCoordx[i],finalCoordy[i]),result);
  }

  if (plotFile!=NULL) {
    fprintf(outfile,"\n0 %f\n",-baseThresh);
    for (i=0;i<j;i+=2) {
      fprintf(outfile,"%d %f\n%d %f\n%d %f\n%d %f\n",
              finalIndex[i],-baseThresh,
              finalIndex[i],-2*baseThresh,
              finalIndex[i+1],-2*baseThresh,
              finalIndex[i+1],-baseThresh);
    }
    fprintf(outfile,"\"Baselines");
    fclose(outfile);
    printf("Done writing baseline plotfile.\n");
  }

  return result;
}

void DrawBaseLines(Picture pict, List pointList, double angle)
#ifdef foo
  int count,int *coordx,int *coordy,double angle)
#endif
{
  int maxLength;
  float x2,y2,x3,y3;
  int x,y;
  Point temp;
  maxLength = pict->width + pict->height;
  while (!endp(pointList)) {
    temp = pop(pointList);
    x = temp->x;
    y = temp->y;
    x2 = x+maxLength*cos(angle);
    y2 = y+maxLength*sin(angle);
    x3 = x-maxLength*cos(angle);
    y3 = y-maxLength*sin(angle);
    DrawLine(pict,x,y,(int)x2,(int)y2,0xff);
    DrawLine(pict,x,y,(int)x3,(int)y3,0xff);
  }
}
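
The baseline routine in the listing above thresholds a projection profile and records each index where the profile crosses the threshold, alternating between the start and the end of a text line. The following is a minimal sketch of that crossing test only, not the listing's code. The name find_line_boundaries, the plain double array, and the use of the previous sample as the comparison value on every step are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores in bounds[] the indices at which profile[]
 * crosses thresh. A rising crossing opens a text line and a falling
 * crossing closes it. The returned count is always even, so bounds[]
 * holds complete (start, end) pairs. */
int find_line_boundaries(const double *profile, int n, double thresh,
                         int *bounds, int max_bounds)
{
    int count = 0;
    int on_line = 0;
    int i;

    assert(profile != NULL && bounds != NULL);
    for (i = 1; i < n && count < max_bounds; ++i) {
        double last = profile[i - 1];
        if (!on_line && last <= thresh && profile[i] > thresh) {
            bounds[count++] = i;      /* entering a text line */
            on_line = 1;
        } else if (on_line && last > thresh && profile[i] <= thresh) {
            bounds[count++] = i;      /* leaving a text line */
            on_line = 0;
        }
    }
    if (count & 1)
        --count;                      /* drop an unmatched final start */
    return count;
}
```

In the listing the threshold is a fixed fraction of the profile maximum (BASE_PERCENTILE), and line pairs shorter than a fraction of the average line height are discarded afterwards; both steps are left out of this sketch.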
Jul 1 13:44 1991 blobify.c

#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "blobify.h"

static UCHAR bitmasks[] = {0x80,0x40,0x20,0x10,0x8,0x4,0x2,0x1};

Picture Blobify(Picture old,int half_mask_size,double threshold)
{
  Picture new;
  int x,y;
  int tval;
  int left,right,top,bottom;
  int width;
  int *counters;
  int *countptr;
  int mask_size;
  UCHAR *xptr,*xyptr;
  int *leftptr;
  int *rightptr;
  UCHAR *topptr;
  UCHAR *bottomptr;
  int uchar_width;
  /* UCHAR bitmask; */
  int count;
  int inside;
  int thold;
  /* Added the following for speedup hack 1/14/91 */
  UCHAR bitMask;
  UCHAR *newCursor;
  UCHAR newValue;
  UCHAR topPixels;
  UCHAR bottomPixels;

  mask_size = 2 * half_mask_size + 1;
  /* uchar_width = ROUND8(old->width) >> 3; */
  uchar_width = old->uchar_width;

  left = half_mask_size;
  right = old->width - half_mask_size - 1;

  top = half_mask_size;
  bottom = old->height - half_mask_size - 1;

  tval = floor(4*half_mask_size*half_mask_size*threshold);
  new = new_pict(old->width,old->height,old->depth);

  counters = (int *)calloc(old->width,sizeof(int));

  width = old->width;
  countptr = counters;
  xptr = old->data;
  bitMask = 0x80;
  for (x = 0; x < width; ++x) {
    /* bitmask = bitmasks[x%8]; */
    xyptr = xptr;
    for (count = 0, y = 0; y < mask_size; ++y) {
      if (*xyptr & bitMask)
        ++count;
      xyptr += uchar_width;
    }
    *(countptr++) = count;
    /* if (x%8==7)
     *   ++xptr; */
    if (bitMask == 0x01) {
      bitMask = 0x80;
      ++xptr;
    }
    else
      bitMask = bitMask >> 1;
  }

  for (y = top; y <= bottom; ++y) {
    countptr = counters;
    for (inside = 0, x = 0; x < mask_size; ++x)
      inside += *countptr++;

    leftptr = counters;
    rightptr = counters + mask_size;
    newCursor = new->data+y*uchar_width+(left>>3);
    bitMask = bitmasks[left%8];
    newValue = 0;
    for (x = left; x <= right; ++x) {
      if (inside > tval)
        /* set pixel */
        newValue |= bitMask;
      /* *(new->data+y*uchar_width+(x>>3)) |= bitmasks[x%8]; */
      if (bitMask == 0x01) {
        bitMask = 0x80;
        *newCursor++ = newValue;
        newValue = 0;
      }
      else
        bitMask = bitMask >> 1;
      inside += *rightptr++;
      inside -= *leftptr++;
    }
    if (bitMask != 0x80) {
      *newCursor = newValue;
    }

    topptr = old->data + (y-half_mask_size)*uchar_width;
    bottomptr = topptr + mask_size*uchar_width;
    countptr = counters;
    bitMask = 0x01;
    for (x = 0; x < width; ++x) {
      /* bitmask = bitmasks[x%8]; */
      if (bitMask == 0x01) {
        topPixels = *topptr++;
        bottomPixels = *bottomptr++;
        bitMask = 0x80;
      }
      else
        bitMask = bitMask >> 1;
      if (topPixels & bitMask) {
        if (!(bottomPixels & bitMask))
          --(*countptr);
      }
      else if (bottomPixels & bitMask)
        ++(*countptr);

      ++countptr;
    }
  }

  return new;
}


#ifdef foo
void main(argc,argv)
int argc;
char **argv;
{
  char *infile,*outfile;
  Picture old,new;
  int half_mask_size;
  float threshold;

  malloc_debug(2);

  if (argc != 5) {
    printf("Usage: %s infile outfile half_mask_size threshold\n",argv[0]);
    exit(0);
  }
  infile = argv[1];
  outfile = argv[2];
  half_mask_size = atoi(argv[3]);
  threshold = atof(argv[4]);

  printf("loading %s...",infile);
  old = load_pict(infile);
  new = components(old,half_mask_size,threshold);
  write_pict(outfile,new);

}
#endif
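
Blobify, as listed above, counts the black pixels inside a square window around each pixel and sets the output pixel when the count exceeds a threshold, keeping per-column counts so that sliding the window costs only one add and one subtract per step. The following is a simplified sketch of that running-sum technique. It assumes one byte per pixel instead of the listing's packed bitmap, and the name blobify_bytes and its integer threshold parameter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: src and dst are w*h arrays holding 0 or 1 per pixel.
 * dst[y*w+x] becomes 1 when the (2*half+1)^2 window centred on (x,y)
 * contains more than tval black pixels. Border pixels stay 0.
 * Returns 0 on success, -1 if memory cannot be allocated. */
int blobify_bytes(const unsigned char *src, unsigned char *dst,
                  int w, int h, int half, int tval)
{
    int mask = 2 * half + 1;
    int x, y;
    int *col;

    memset(dst, 0, (size_t)w * h);
    if (w < mask || h < mask)
        return 0;
    col = calloc((size_t)w, sizeof *col);
    if (col == NULL)
        return -1;

    /* Column counts for the first window position (rows 0..mask-1). */
    for (x = 0; x < w; ++x)
        for (y = 0; y < mask; ++y)
            col[x] += src[y * w + x];

    for (y = half; y <= h - half - 1; ++y) {
        int inside = 0;
        for (x = 0; x < mask; ++x)
            inside += col[x];
        for (x = half; x <= w - half - 1; ++x) {
            if (inside > tval)
                dst[y * w + x] = 1;
            /* Slide right: add the entering column, drop the leaving one. */
            if (x + half + 1 < w)
                inside += col[x + half + 1] - col[x - half];
        }
        /* Slide down one row: add the entering row, drop the leaving one. */
        if (y + half + 1 < h)
            for (x = 0; x < w; ++x)
                col[x] += src[(y + half + 1) * w + x] - src[(y - half) * w + x];
    }
    free(col);
    return 0;
}
```

The listing derives its integer threshold from a fractional one as floor(4*half_mask_size*half_mask_size*threshold); a caller of this sketch would compute tval the same way.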
Aug 26 18:10 1991 boxes.c



#include <stdio.h>
#include <values.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "types.h"
#include "lists.h"

extern int irint(double);

#define MAX_QUEUE_SIZE 10000
#define BLACK 1
#define WHITE 0

#define ABS(a) ((a)<0?-(a):(a))

typedef Point PointArray;

typedef struct {
  PointBody ulc,lrc;
} MinMaxBox;

typedef struct {
  PointBody xwitness,ywitness;
} WitnessBox;

typedef struct {
  PointArray data;
  int first,last;
  int size;
} QueueBody,*Queue;

Queue MakeQueue(size)
int size;
{
  Queue q;
  if ((q = (Queue)calloc(1,sizeof(QueueBody))) == NULL) {
    printf("Cannot alloc space for queue body\n");
    exit(0);
  }
  if ((q->data = (PointArray)calloc(size,sizeof(PointBody))) == NULL) {
    printf("Cannot allocate space for queue array\n");
    exit(0);
  }
  q->first = q->last = 0;
  q->size = size;
  return q;
}

void InsertPoint(x,y,q)
int x,y;
Queue q;
{
  q->data[q->last].x = x;
  q->data[q->last].y = y;
  q->last = (q->last+1)%q->size;
  if (q->last == q->first) {
    printf("Maximum q size exceeded\n");
    exit(0);
  }
}

void GetFirst(x,y,q)
int *x,*y;
Queue q;
{
  if (q->first == q->last) {
    printf("Error: tried to pop empty queue\n");
    exit(0);
  }
  *x = q->data[q->first].x;
  *y = q->data[q->first].y;
  q->first = (q->first+1)%q->size;
}

BOOLEAN Empty(q)
Queue q;
{
  return q->first == q->last;
}

void InsertBlackNeighbors(Picture pict,int x,int y,Queue queue)
{
  if (ReadPixel(pict,x+1,y)) {
    WritePixel(pict,x+1,y,WHITE);
    InsertPoint(x+1,y,queue);
  }
  if (ReadPixel(pict,x-1,y)) {
    WritePixel(pict,x-1,y,WHITE);
    InsertPoint(x-1,y,queue);
  }
  if (ReadPixel(pict,x,y+1)) {
    WritePixel(pict,x,y+1,WHITE);
    InsertPoint(x,y+1,queue);
  }
  if (ReadPixel(pict,x,y-1)) {
    WritePixel(pict,x,y-1,WHITE);
    InsertPoint(x,y-1,queue);
  }
}

void PointFromTheta(theta,x,y)
float theta;
float *x,*y;
{
  *x = cos(theta);
  *y = sin(theta);
}

void Normal(x,y,nx,ny)
float x,y;
float *nx,*ny;
{
  *nx = -y;
  *ny = x;
}

int DotFI(fx,fy,ix,iy)
float fx,fy;
int ix,iy;
{
  return irint(fx*ix+fy*iy);
}

static float pox,poy,pnx,pny;

void MinMax(boundingBox,oldFrameBox,px,py)
MinMaxBox *boundingBox;
WitnessBox *oldFrameBox;
int px,py;
{
  /* IGNORE THETA FOR THE TIME BEING */
  if (boundingBox->lrc.x < DotFI(pox,poy,px,py)) {
    boundingBox->lrc.x = DotFI(pox,poy,px,py);
  }
  if (boundingBox->lrc.y < DotFI(pnx,pny,px,py)) {
    boundingBox->lrc.y = DotFI(pnx,pny,px,py);
  }
  if (boundingBox->ulc.x > DotFI(pox,poy,px,py)) {
    boundingBox->ulc.x = DotFI(pox,poy,px,py);
    oldFrameBox->xwitness.x = px;
    oldFrameBox->xwitness.y = py;
  }
  if (boundingBox->ulc.y > DotFI(pnx,pny,px,py)) {
    boundingBox->ulc.y = DotFI(pnx,pny,px,py);
    oldFrameBox->ywitness.x = px;
    oldFrameBox->ywitness.y = py;
  }
}

/* Set the pixels on the border of the image to the color WHITE so that
 * the paint routine need never worry about going off the edge of the
 * image. */
void FramePicture(pict)
Picture pict;
{
  int i;
  for (i = 0; i < pict->height; ++i) {
    WritePixel(pict,0,i,WHITE);
    WritePixel(pict,pict->width-1,i,WHITE);
  }
  for (i = 0; i < pict->width; ++i) {
    WritePixel(pict,i,0,WHITE);
    WritePixel(pict,i,pict->height-1,WHITE);
  }
}

/*
 * Given as input a thresholded image, find the borders of the connected
 * components. Assumes image is thresholded to 0 and 1.
 */
void PaintComponent(pict,x,y,queue,boundingBox,oldFrameBox)
Picture pict;
int x,y;
Queue queue;
MinMaxBox *boundingBox;
WitnessBox *oldFrameBox;
{
  boundingBox->ulc.x = boundingBox->lrc.x = DotFI(pox,poy,x,y);
  boundingBox->ulc.y = boundingBox->lrc.y = DotFI(pnx,pny,x,y);
  oldFrameBox->xwitness.x = oldFrameBox->ywitness.x = x;
  oldFrameBox->xwitness.y = oldFrameBox->ywitness.y = y;

  InsertPoint(x,y,queue);
  WritePixel(pict,x,y,WHITE);
  /* printf("Queue status: %s\n",(Empty(queue))?"empty":"not empty"); */
  while (!Empty(queue)) {
    GetFirst(&x,&y,queue);
    MinMax(boundingBox,oldFrameBox,x,y);
    InsertBlackNeighbors(pict,x,y,queue);
  }
}

int iabs(int x)
{
  if (x<0)
    return -x;
  else
    return x;
}

BOOLEAN PointInBounds(Picture pict,int x,int y)
{
  return x>=0 && x<pict->width && y>=0 && y<pict->height;
}

BOOLEAN BoxInBounds(Picture pict,int x, int y, int width, int height,
                    double angle)
{
  int rightX,rightY,downX,downY;
  rightX = width*cos(angle);
  rightY = width*sin(angle);
  downX = height*cos(angle+M_PI/2);
  downY = height*sin(angle+M_PI/2);
  return (PointInBounds(pict,x,y) &&
          PointInBounds(pict,x+rightX,y+rightY) &&
          PointInBounds(pict,x+rightX+downX,y+rightY+downY) &&
          PointInBounds(pict,x+downX,y+downY));
}

void GetCorner(WitnessBox *box,int *ulcx,int *ulcy)
{
  double c2;
  c2 = (-pny*(box->ywitness.x-box->xwitness.x) +
        pnx*(box->ywitness.y-box->xwitness.y))/
       (pox*pny - pnx*poy);
  *ulcx = c2*pox+box->ywitness.x;
  *ulcy = c2*poy+box->ywitness.y;
}

List FindBorders(Picture pict,double theta)
{
  int x,y;
  int ulcx,ulcy;
  Queue queue;
  MinMaxBox boundingBox;
  WitnessBox oldFrameBox;
  List boxList;
  int width,height;

  queue = MakeQueue(MAX_QUEUE_SIZE);

  PointFromTheta(theta,&pox,&poy);
  Normal(pox,poy,&pnx,&pny);

  printf("Framing picture\n");
  FramePicture(pict); /* Put a "visited" color border
                       * around the image */
  boxList = nil;
  for (y = 1; y < pict->height-1; ++y)
    for (x = 1; x < pict->width-1; ++x)
      if (ReadPixel(pict,x,y)) {
        /* printf("Found component at (%d,%d)\n",x,y); */
        PaintComponent(pict,x,y,queue,&boundingBox,&oldFrameBox);
        /* printf("Making box: %d %d %d %d\n",
                  oldFrameBox.ulc.x,
                  oldFrameBox.ulc.y,
                  oldFrameBox.lrc.x,
                  oldFrameBox.lrc.y);
        */
        GetCorner(&oldFrameBox,&ulcx,&ulcy);
        width = boundingBox.lrc.x-boundingBox.ulc.x;
        height = boundingBox.lrc.y-boundingBox.ulc.y;
        /* if (iabs(height)>10) */
        if (BoxInBounds(pict,ulcx,ulcy,
                        width,height,theta))
          push(MakeBox(ulcx,ulcy,
                       width,height,theta),
               boxList);
      }
  printf("Found %d boxes completely on the page\n",ListLength(boxList));
  return boxList;
}

void DrawBox(Picture pict,Box box)
{
  int rightX,rightY,downX,downY;
  rightX = box->width*cos(box->angle);
  rightY = box->width*sin(box->angle);
  downX = box->height*cos(box->angle+M_PI/2);
  downY = box->height*sin(box->angle+M_PI/2);
  /* printf("DrawBox: %d %d %d %d\n",box->x,box->y,box->width,box->height); */
  DrawLine(pict,box->x,box->y,box->x+rightX,box->y+rightY,0xff);
  DrawLine(pict,box->x+rightX,box->y+rightY,
           box->x+rightX+downX,box->y+rightY+downY,0xff);
  DrawLine(pict,box->x+rightX+downX,box->y+rightY+downY,
           box->x+downX,box->y+downY,0xff);
  DrawLine(pict,box->x+downX,box->y+downY,box->x,box->y,0xff);
}

void DrawColorBox(Picture pict,Box box,int color)
{
  int rightX,rightY,downX,downY;
  rightX = box->width*cos(box->angle);
  rightY = box->width*sin(box->angle);
  downX = box->height*cos(box->angle+M_PI/2);
  downY = box->height*sin(box->angle+M_PI/2);
  /* printf("DrawBox: %d %d %d %d\n",box->x,box->y,box->width,box->height); */
  DrawLine(pict,box->x,box->y,box->x+rightX,box->y+rightY,color);
  DrawLine(pict,box->x+rightX,box->y+rightY,
           box->x+rightX+downX,box->y+rightY+downY,color);
  DrawLine(pict,box->x+rightX+downX,box->y+rightY+downY,
           box->x+downX,box->y+downY,color);
  DrawLine(pict,box->x+downX,box->y+downY,box->x,box->y,color);
}

void DrawBoxList(Picture pict,List boxList)
{
  while (!endp(boxList)) {
    DrawBox(pict,(Box)pop(boxList));
  }
}


#ifdef TRYMAIN
/* WARNING - be sure to replace the height check in FindBorders */
#endif
void main(argc,argv)
int argc;
char **argv;
{
  char *infileName,*outfileName;
  List boxList;
  int width,height;
  float theta;
  Picture pict,finalPict;
  FILE *outfile;

  if (argc != 4) {
    printf("Usage: %s infile outfile page_orientation\n",argv[0]);
    exit(0);
  }

  infileName = argv[1];
  outfileName = argv[2];
  theta = atof(argv[3]);

  printf("Loading %s...",infileName);
  pict = load_pict(infileName);

  printf("\nFinding boxes.\n");

  finalPict = new_pict(pict->width,pict->height,pict->depth);
  /* CopyPicture(finalPict,pict); */
  boxList = FindBorders(pict,theta);

  DrawBoxList(finalPict,boxList);
  write_pict(outfileName,finalPict);

}
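
PaintComponent in boxes.c grows a connected component breadth-first: each black pixel is whitened as it is queued, so it is never visited twice, and the bounding extremes are updated as pixels are dequeued. The following is a minimal sketch of that flood fill for the unrotated case (theta = 0). The BBox type, the byte-per-pixel image, the explicit bounds checks (in place of the listing's white frame), and the name paint_component are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x0, y0, x1, y1; } BBox;   /* hypothetical type */

/* Erases the 4-connected component containing (sx, sy) from img (w*h, one
 * byte per pixel, nonzero = black) and stores its axis-aligned bounding box
 * in *box. Returns 0 on success, -1 if the seed is not black or memory
 * cannot be allocated. */
int paint_component(unsigned char *img, int w, int h, int sx, int sy, BBox *box)
{
    static const int dx[4] = {1, -1, 0, 0};
    static const int dy[4] = {0, 0, 1, -1};
    int *queue;
    int head = 0, tail = 0;

    if (sx < 0 || sx >= w || sy < 0 || sy >= h || !img[sy * w + sx])
        return -1;
    /* Every pixel is queued at most once, so w*h slots always suffice. */
    queue = malloc((size_t)w * h * sizeof *queue);
    if (queue == NULL)
        return -1;

    box->x0 = box->x1 = sx;
    box->y0 = box->y1 = sy;
    img[sy * w + sx] = 0;                 /* mark visited by erasing */
    queue[tail++] = sy * w + sx;

    while (head < tail) {
        int p = queue[head++];
        int x = p % w, y = p / w, k;
        if (x < box->x0) box->x0 = x;
        if (x > box->x1) box->x1 = x;
        if (y < box->y0) box->y0 = y;
        if (y > box->y1) box->y1 = y;
        for (k = 0; k < 4; ++k) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx >= 0 && nx < w && ny >= 0 && ny < h && img[ny * w + nx]) {
                img[ny * w + nx] = 0;
                queue[tail++] = ny * w + nx;
            }
        }
    }
    free(queue);
    return 0;
}
```

The listing instead uses a fixed-size circular queue (MAX_QUEUE_SIZE) and projects each point onto the page-orientation axes with DotFI before comparing, which yields a box aligned with the text rather than with the image.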






Jan 16 15:52 1991 dict.c 

#include <stdio.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"

void WriteOutlinePair(OutlinePair o, FILE *fp)
{
  fwrite(o->box,sizeof(BoxBody),1,fp);
  fwrite(&(o->blackoutHeight),sizeof(float),1,fp);
  fwrite(&(o->numberOfLegs),sizeof(int),1,fp);
  fwrite(&(o->offset),sizeof(int),1,fp);
  fwrite(&(o->width),sizeof(int),1,fp);

  fwrite(o->x,sizeof(float),o->numberOfLegs,fp);
  fwrite(o->top,sizeof(float),o->numberOfLegs,fp);
  fwrite(o->bottom,sizeof(float),o->numberOfLegs,fp);
}

void WriteDictionary(Dictionary dict, char *filename)
{
  FILE *fp;
  int temp;
  int i;
  if ((fp = fopen(filename,"w")) == NULL)
    DoError("WriteDictionary: Error opening output file.\n",NULL);
  temp = 1234567;
  fwrite(&temp,sizeof(int),1,fp);
  fwrite(&(dict->numberOfEntries),sizeof(int),1,fp);

  if (dict->infoString == NULL) {
    temp = 0;
    fwrite(&temp,sizeof(int),1,fp);
  }
  else {
    temp = strlen(dict->infoString)+1;
    fwrite(&temp,sizeof(int),1,fp);
    fwrite(dict->infoString,sizeof(char),temp,fp);
  }

  for (i = 0; i < dict->numberOfEntries; ++i)
    WriteOutlinePair(*(dict->outlines+i),fp);
  fclose(fp);
}


/* Reads a Box from a binary stream. The type Box is defined in box.h */
Box ReadBox(FILE *fp)
{
  Box temp;
  temp = (Box)calloc(1,sizeof(BoxBody));
  if (temp == NULL)
    DoError("ReadBox: cannot allocate space\n",NULL);
  if (fread(temp,sizeof(BoxBody),1,fp) != 1)
    DoError("ReadBox: error reading bounding box\n",NULL);
  return temp;
}

/* Reads an OutlinePair from a binary stream. The format of an OutlinePair
 * follows:
 *   BoxBody - shape bounding box
 *   float - blackout bar height
 *   int - number of legs in the contour
 *   int - x coordinate of left edge of contour
 *   int - width in pixels of edge contour
 *   float[numberOfLegs] - x coordinates of contours
 *   float[numberOfLegs] - y coordinates of top contour
 *   float[numberOfLegs] - y coordinates of bottom contour
 */
OutlinePair ReadOutlinePair(FILE *fp)
{
  OutlinePair temp;
  temp = (OutlinePair)calloc(1,sizeof(OutlinePairBody));
  if (temp == NULL)
    DoError("ReadOutlinePair: cannot allocate space\n",NULL);
  temp->box = ReadBox(fp);

  if (fread(&(temp->blackoutHeight),sizeof(float),1,fp) != 1)
    DoError("ReadOutlinePair: error reading blackoutHeight\n",NULL);

  if (fread(&(temp->numberOfLegs),sizeof(int),1,fp) != 1)
    DoError("ReadOutlinePair: error reading length\n",NULL);

  if (fread(&(temp->offset),sizeof(int),1,fp) != 1)
    DoError("ReadOutlinePair: error reading offset\n",NULL);
  if (fread(&(temp->width),sizeof(int),1,fp) != 1)
    DoError("ReadOutlinePair: error reading width\n",NULL);

  temp->x = (float *)calloc(temp->numberOfLegs,sizeof(float));
  if (temp->x == NULL)
    DoError("ReadOutlinePair: cannot allocate space\n",NULL);
  if (fread(temp->x,
            sizeof(float),temp->numberOfLegs,fp) != temp->numberOfLegs)
    DoError("ReadOutlinePair: error reading x coords\n",NULL);

  temp->top = (float *)calloc(temp->numberOfLegs,sizeof(float));
  if (temp->top == NULL)
    DoError("ReadOutlinePair: cannot allocate space\n",NULL);
  if (fread(temp->top,sizeof(float),
            temp->numberOfLegs,fp) != temp->numberOfLegs)
    DoError("ReadOutlinePair: error reading topY coords\n",NULL);

  temp->bottom = (float *)calloc(temp->numberOfLegs,sizeof(float));
  if (temp->bottom == NULL)
    DoError("ReadOutlinePair: cannot allocate space\n",NULL);
  if (fread(temp->bottom,
            sizeof(float),temp->numberOfLegs,fp) != temp->numberOfLegs)
    DoError("ReadOutlinePair: error reading bottomY coords\n",NULL);

  return temp;
}

/* Create a new Dictionary structure with space allocated for the
 * entries. */
Dictionary NewDict(int numberOfEntries)
{
  Dictionary temp;
  temp = (Dictionary)calloc(1,sizeof(DictionaryBody));
  if (temp == NULL)
    DoError("NewDict: cannot allocate space\n",NULL);
  temp->numberOfEntries = numberOfEntries;
  temp->infoString = NULL;
  temp->rawOutlines = (RawOutlinePair *)calloc(numberOfEntries,
                                               sizeof(RawOutlinePair));
  temp->outlines = (OutlinePair *)calloc(numberOfEntries,
                                         sizeof(OutlinePair));
  if ((temp->outlines == NULL)||(temp->rawOutlines == NULL))
    DoError("NewDict: cannot allocate space\n",NULL);
  return temp;
}

/* Read a dictionary from a binary format file. The file organization
 * follows:
 *   int - number of entries in the dictionary
 *   OutlinePair[numberOfEntries] - outlines of each shape in the dictionary
 * When a dictionary is read in, the shapes are sorted such that they fall
 * in the order of words on textlines. */
Dictionary ReadDictionary(char *filename)
{
  FILE *fp;
  Dictionary dict;
  int i;
  int temp;
  int infoStringLength;
  int numberOfEntries;
  int magicNumber;

  if ((fp = fopen(filename,"r")) == NULL)
    DoError("Error opening input file\n",NULL);

  if (fread(&magicNumber,sizeof(int),1,fp) != 1)
    DoError("Error reading dictionary\n",NULL);
  if (magicNumber != 1234567)
    DoError("ReadDictionary: input file %s is not a dictionary file.\n",
            filename);

  if (fread(&numberOfEntries,sizeof(int),1,fp) != 1)
    DoError("Error reading dictionary\n",NULL);
  dict = NewDict(numberOfEntries);

  if (fread(&infoStringLength,sizeof(int),1,fp) != 1)
    DoError("Error reading dictionary\n",NULL);
  if (infoStringLength) {
    if ((dict->infoString = (char *)calloc(infoStringLength,sizeof(char))) ==
        NULL)
      DoError("ReadDictionary: cannot allocate space for info string.\n",NULL);
    fread(dict->infoString,infoStringLength,sizeof(char),fp);
    *(dict->infoString+infoStringLength-1) = '\0'; /* Set last char to 0 just in case */
  }

  for (i = 0; i < numberOfEntries; ++i)
    *(dict->outlines+i) = ReadOutlinePair(fp);
  fclose(fp);
  return dict;
}

char *ArgListToString(int argc, char **argv)
{
  int i;
  int totalLength;
  char *theString;
  char *destCursor,*srcCursor;

  for (i = 0, totalLength = 0; i < argc; ++i)
    totalLength += strlen(argv[i])+1; /* Room for each arg and one space */
  totalLength++; /* Room for the EOS character */

  if ((theString = (char *)calloc(totalLength,sizeof(char))) == NULL)
    DoError("ArgListToString: cannot allocate space.\n",NULL);

  for (i = 0, destCursor = theString; i < argc; ++i) {
    srcCursor = argv[i];
    while (*srcCursor != '\0')
      *destCursor++ = *srcCursor++;
    *destCursor++ = ' ';
  }
  *destCursor = '\0';

  return theString;
}
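
Judging from the code in WriteDictionary and ReadDictionary (rather than from the shorter format comment), a dictionary file begins with a magic number, then the entry count, then the length of an optional info string followed by its bytes. The following is a minimal sketch of writing and checking such a header. The names write_header and read_header are hypothetical, the outline records are omitted, and, as in the listing, the integers are written in host byte order, so files are not portable across machines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DICT_MAGIC 1234567   /* value used in the listing */

/* Writes magic, entry count, info-string length (including the NUL) and the
 * string itself. Returns 0 on success, -1 on a short write. */
int write_header(FILE *fp, int entries, const char *info)
{
    int magic = DICT_MAGIC;
    int len = info ? (int)strlen(info) + 1 : 0;

    if (fwrite(&magic, sizeof magic, 1, fp) != 1) return -1;
    if (fwrite(&entries, sizeof entries, 1, fp) != 1) return -1;
    if (fwrite(&len, sizeof len, 1, fp) != 1) return -1;
    if (len > 0 && fwrite(info, 1, (size_t)len, fp) != (size_t)len) return -1;
    return 0;
}

/* Reads the header back. On success returns 0 and sets *info to a newly
 * allocated string (or NULL if none was stored); the caller frees it. */
int read_header(FILE *fp, int *entries, char **info)
{
    int magic, len;

    *info = NULL;
    if (fread(&magic, sizeof magic, 1, fp) != 1 || magic != DICT_MAGIC) return -1;
    if (fread(entries, sizeof *entries, 1, fp) != 1) return -1;
    if (fread(&len, sizeof len, 1, fp) != 1 || len < 0) return -1;
    if (len > 0) {
        *info = malloc((size_t)len);
        if (*info == NULL) return -1;
        if (fread(*info, 1, (size_t)len, fp) != (size_t)len) {
            free(*info);
            *info = NULL;
            return -1;
        }
        (*info)[len - 1] = '\0';      /* force termination, as the listing does */
    }
    return 0;
}
```

Unlike the listing, this sketch checks the result of every fread and fwrite, including the read of the info string.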
#include <stdio.h>

#include <math.h>
#include <values.h>
#include "boolean.h"
#include "types.h"
#include "pict.h"
#include "diff.h"

void main(int argc,char **argv)
{
  Picture pict;
  char *infile1,*infile2,*outfile;

  if (argc != 4) {
    printf("Usage:\n");
    printf("  %s infile1 infile2 outfile\n",argv[0]);
    exit(-1);
  }

  infile1 = argv[1];
  infile2 = argv[2];
  outfile = argv[3];
  pict = CompareDictionaries(infile1,infile2);
  WritePictureAsAscii(pict,outfile);
}
Jun 21 15:54 1991 fft.c



/* Copyright 1991 by Michael Hopcroft
 * Right is hereby granted to Xerox Corporation to make use of this
 * code free of charge. */
#include <stdio.h>
#include <math.h>
#include "fft.h"

/* Applies bit reversal permutation matrix to array a. length must be a power
 * of 2. */
void BitReverse(float *a,int n)
{
  int i,j,k;
  float temp;

  j = 1;
  for (i = 1; i < n; ++i) {
    if (i < j) {
      temp = a[i-1];
      a[i-1] = a[j-1];
      a[j-1] = temp;
    }
    k = n/2;
    while (k < j) {
      j = j-k;
      k = k/2;
    }
    j = j+k;
  }
}

#define TWOPI (M_PI*2)

void fft(float *real,float *imag,int logn,int mode)
{
  int n;
  int j,top,i,id,bottom;
  int stage,subpartLength;
  float tempr,tempi,temp2r,temp2i,ar,ai,wr,wi,angle;

  n = irint(exp2((double)logn));

  for (stage = 1, subpartLength = n;
       stage <= logn;
       ++stage, subpartLength /= 2) {
    angle = TWOPI/subpartLength;
    ar = 1.0;
    ai = 0.0;
    if (mode == REVERSE) {
      wr = cos(angle);
      wi = sin(angle);
    } else {
      wr = cos(angle);
      wi = -sin(angle);
    }
    for (j = 0; j < subpartLength/2; ++j) { /* for each offset in a part */
      for (top = j; top < n; top += subpartLength) { /* for each part */
        bottom = top+subpartLength/2;
        tempr = real[bottom]; /* temp = x[id] */
        tempi = imag[bottom];
        real[bottom] = real[top]-real[bottom]; /* x[id] = x[i] - x[id] */
        imag[bottom] = imag[top]-imag[bottom];
        temp2r = real[bottom]*ar-imag[bottom]*ai; /* temp2 = x[id]*a */
        temp2i = real[bottom]*ai+imag[bottom]*ar;
        real[bottom] = temp2r; /* x[id] = temp2 */
        imag[bottom] = temp2i;
        real[top] += tempr; /* x[i] += temp */
        imag[top] += tempi;
      }
      temp2r = ar*wr-ai*wi; /* a *= w */
      temp2i = ai*wr+ar*wi;
      ar = temp2r;
      ai = temp2i;
    }
  }
  BitReverse(real,n);
  BitReverse(imag,n);

#ifdef foo
  if (mode == MAGNITUDE)
    for (i = 0; i < n; ++i)
      real[i] = sqrt(real[i]*real[i]+imag[i]*imag[i]);
#endif

  if (mode == MAGNITUDE)
    for (i = 0; i < n; ++i)
      real[i] = sqrt(real[i]*real[i]+imag[i]*imag[i]);
}

#ifdef TRYMAIN
void main(int argc,char **argv)
{
#define POWER 8
#define LENGTH 256
  float real[LENGTH];
  float imag[LENGTH];
  int i;
#ifdef foo
  for (i = 0; i < LENGTH; ++i) {
    if (i < LENGTH/2)
      real[i] = 1.0;
    else
      real[i] = 0.0;
    imag[i] = 0.0;
  }
#endif

  for (i = 0; i < LENGTH; ++i) {
    real[i] = sin(8*TWOPI*i/(LENGTH-1));
    imag[i] = 0.0;
  }
  fft(real,imag,POWER,MAGNITUDE);
  for (i = 0; i < LENGTH; ++i)
    printf("%d %f\n",i,real[i]);
}
#endif
#include <stdio.h>

#include <math.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "fontNorm.h"

extern double ceil(double);
extern int irint(double);

#define UP 0
#define DOWN 1
typedef int Direction;

extern Picture thePict;

void StoreRawOutlinePair(Dictionary dict, int dictEntry,
                         Box box,int *bothX,int *topY, int *baseY,
                         int numberOfLegs)
{
  RawOutlinePair temp;
  int i;
  int *xCursor,*topCursor,*bottomCursor;

  temp = (RawOutlinePair)calloc(1,sizeof(RawOutlinePairBody));
  if (temp == NULL)
    DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

  temp->box = box;
  temp->numberOfLegs = numberOfLegs;

  temp->x = (int *)calloc(temp->numberOfLegs,sizeof(int));
  temp->top = (int *)calloc(temp->numberOfLegs,sizeof(int));
  temp->bottom = (int *)calloc(temp->numberOfLegs,sizeof(int));
  if ((temp->x == NULL) ||
      (temp->top == NULL) ||
      (temp->bottom == NULL))
    DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

  xCursor = temp->x;
  topCursor = temp->top;
  bottomCursor = temp->bottom;

  for (i = 0; i < numberOfLegs; ++i) {
    *xCursor++ = *bothX++;
    *topCursor++ = *topY++;
    *bottomCursor++ = *baseY++;
  }
  *(dict->rawOutlines+dictEntry) = temp;
}

int RawOutlineWidth(RawOutlinePair a,int middleLine)
{
  int i,numberOfLegs,right,left;
  int *topCursor,*bottomCursor;
  int topValue,bottomValue;

  numberOfLegs = a->numberOfLegs;

  topCursor = a->top;
  bottomCursor = a->bottom;
  for (i = 0; i < numberOfLegs; ++i) {
    topValue = *topCursor++;
    bottomValue = *bottomCursor++;

    if (topValue != HIT_THE_BOX) {
      topValue = middleLine - topValue;
      if (topValue < 0)
        topValue = 0;
    }
    else
      topValue = 0;

    if (bottomValue != HIT_THE_BOX) {
      bottomValue = bottomValue - middleLine;
      if (bottomValue < 0)
        bottomValue = 0;
    }
    else
      bottomValue = 0;

    if ((bottomValue != 0)||(topValue != 0))
      break;
  }
  left = i;

  topCursor = a->top+numberOfLegs-1;
  bottomCursor = a->bottom+numberOfLegs-1;
  for (i = numberOfLegs-1; i >= 0; --i) {
    topValue = *topCursor--;
    bottomValue = *bottomCursor--;

    if (topValue != HIT_THE_BOX) {
      topValue = middleLine - topValue;
      if (topValue < 0)
        topValue = 0;
    }
    else
      topValue = 0;

    if (bottomValue != HIT_THE_BOX) {
      bottomValue = bottomValue - middleLine;
      if (bottomValue < 0)
        bottomValue = 0;
    }
    else bottomValue = 0;

    if ((topValue != 0)||(bottomValue != 0))
      break;
  }
  right = i+1;

  return right-left;
}

void ResampleOutlinePair(OutlinePair a, float newToOldFactor)
/* Resample an outline pair using linear interpolation. */
{
    int newWidth,oldWidth,i;
    int oldLeft,oldRight;
    float oldCenter;
    float *newX,*newTop,*newBottom;
    float *xCursor,*topCursor,*bottomCursor;

    oldWidth = a->numberOfLegs;
    newWidth = irint(newToOldFactor*oldWidth);

    newX = (float *)calloc(newWidth,sizeof(float));
    newTop = (float *)calloc(newWidth,sizeof(float));
    newBottom = (float *)calloc(newWidth,sizeof(float));
    if ((newX==NULL)||(newTop==NULL)||(newBottom==NULL))
        DoError("ResampleOutlinePair: cannot allocate space.\n",NULL);

    xCursor = newX;
    topCursor = newTop;
    bottomCursor = newBottom;

    for (i=0;i<newWidth;++i) {
        oldCenter = i/(float)newWidth*(float)oldWidth;
        oldLeft = irint(floor(oldCenter));
        oldRight = irint(ceil(oldCenter));
        if (oldLeft==oldRight) {
            *xCursor++ = *(a->x+oldLeft);
            *topCursor++ = *(a->top+oldLeft);
            *bottomCursor++ = *(a->bottom+oldLeft);
        }
        else {
            float slope;
            slope = *(a->x+oldRight)-*(a->x+oldLeft);
            *xCursor++ = *(a->x+oldLeft) + (oldCenter-oldLeft)*slope;
            slope = *(a->top+oldRight)-*(a->top+oldLeft);
            *topCursor++ = *(a->top+oldLeft) + (oldCenter-oldLeft)*slope;
            slope = *(a->bottom+oldRight)-*(a->bottom+oldLeft);
            *bottomCursor++ = *(a->bottom+oldLeft) + (oldCenter-oldLeft)*slope;
        }
    }

    free(a->x);
    free(a->top);
    free(a->bottom);

    a->x = newX;
    a->top = newTop;
    a->bottom = newBottom;
    a->numberOfLegs = newWidth;
}
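
The interpolation step used by ResampleOutlinePair can be illustrated on a single array. The following is a minimal sketch only, not the routine from the listing: the name resample_linear is hypothetical, it works on one float array instead of an outline pair, and it clamps the right-hand neighbour to the last sample, which is an assumption added here because the ceil() index can reach the array length when the contour is stretched.

```c
#include <assert.h>
#include <stdlib.h>

/* Resample src[0..n_old-1] to n_new samples by linear interpolation.
 * Returns a malloc'd array (caller frees) or NULL if allocation fails.
 * resample_linear is a hypothetical name, not part of the listing. */
float *resample_linear(const float *src, int n_old, int n_new)
{
    float *dst;
    int i;

    assert(src != NULL && n_old > 0 && n_new > 0);
    dst = malloc((size_t)n_new * sizeof *dst);
    if (dst == NULL)
        return NULL;

    for (i = 0; i < n_new; ++i) {
        /* Position of output sample i on the input axis. */
        float center = (float)i / (float)n_new * (float)n_old;
        int left = (int)center;          /* floor, since center >= 0 */
        int right;

        if (left > n_old - 1)
            left = n_old - 1;
        right = (center > (float)left) ? left + 1 : left;
        if (right > n_old - 1)
            right = n_old - 1;           /* clamp (assumption, see text) */

        dst[i] = src[left] + (center - (float)left) * (src[right] - src[left]);
    }
    return dst;
}
```

With the input {0, 10, 20, 30} stretched to eight samples, each original sample is followed by the midpoint to its neighbour, and the final sample repeats the last input value because of the clamp.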

void StoreOutlinePair(Dictionary dict, int dictEntry,
                      int middleLine, int fontXHeight,
                      int ascenderHeight, NormalizationDescriptor *nd)
/* This routine normalizes the raw outline pair stored in dict at dictEntry using the following
 * operations:
 * 1) For the top contour, shift so that the middle line is at y=0 and negate so that the
 *    higher points are greater than 0. For the bottom, shift so that middle line is at y=0,
 *    but don't flip. Thus, lower points have y coordinates greater than 0.
 *    Consider points whose value is HIT_THE_BOX to be at y=0. These correspond to gaps
 *    between the letters.
 * 2) Compress top and bottom y coordinates by 1/fontXHeight so that the coordinates at the
 *    distance of the fontXHeight have value 1. Note that 1 is an arbitrary number. It is
 *    unlikely that a signal will have parts that are the x height above the center line
 *    anyway.
 *    FOR TOP CONTOUR,
 *    IF HEIGHT IS GREATER THAN XHEIGHT, SCALE DIFFERENCE BY 1.5/ASCENDER_HEIGHT.
 *    ELSE SCALE DIFFERENCE BY 1/XHEIGHT.
 *    FOR BOTTOM CONTOUR,
 *    SCALE BY 1.5/ASCENDER_HEIGHT.
 * 3) Compress the x coordinates by the same factor as in step 2. Note that this does not
 *    actually resample the contour. NOW DO THIS WITH RESAMPLE. USE SCALE FACTOR OF
 *    20/XHEIGHT.
 * 4) Remove left and right ends of the contour that have y values of zero. This is so the
 *    contour starts where the word starts, rather than at the edge of its bounding box.
 * 5) Resample the contour to stretch by firstFontXWidth/fontXWidth. KILL THIS OPERATION.
 */

{
    RawOutlinePair raw;
    OutlinePair temp;
    int i,numberOfLegs;
    int y;
    int offset;
    int *xSCursor,*topSCursor,*bottomSCursor;
    float *xDCursor,*topDCursor,*bottomDCursor;
    float *xCursor,*topCursor,*bottomCursor;
    int left,right;
    float foffset;
    float ascenderFactor,xHeightFactor,widthFactor;

    raw = *(dict->rawOutlines+dictEntry);

    temp = (OutlinePair)calloc(1,sizeof(OutlinePairBody));
    if (temp == NULL)
        DoError("StoreOutlinePair: cannot allocate space\n",NULL);

    temp->x = (float *)calloc(raw->numberOfLegs,sizeof(float));
    temp->top = (float *)calloc(raw->numberOfLegs,sizeof(float));
    temp->bottom = (float *)calloc(raw->numberOfLegs,sizeof(float));
    if ((temp->x == NULL) ||
        (temp->top == NULL) ||
        (temp->bottom == NULL))
        DoError("StoreOutlinePair: cannot allocate space\n",NULL);

    temp->box = raw->box;
    temp->blackoutHeight = 0;
    temp->numberOfLegs = raw->numberOfLegs;
    offset = temp->offset = *(raw->x);
    temp->width = *(raw->x+raw->numberOfLegs-1)-temp->offset;

    xDCursor = temp->x;
    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    xSCursor = raw->x;
    topSCursor = raw->top;
    bottomSCursor = raw->bottom;

    ascenderFactor = 1.5/ascenderHeight;
    xHeightFactor = 1.0/fontXHeight;
    widthFactor = 20.0/fontXHeight;
    if (nd->noXHeightNormalize) {
        xHeightFactor = 1.0;
        ascenderFactor = 1.0;
    }
    if (nd->noAscenderNormalize)
        ascenderFactor = xHeightFactor;

    numberOfLegs = raw->numberOfLegs;
    for (i=0;i<numberOfLegs;++i) {
        if (*topSCursor == HIT_THE_BOX) {
            y = 0;
            topSCursor++;
        }
        else {
            y = middleLine - *topSCursor++;
            if (y<0)
                y = 0;
        }
        if (y>fontXHeight/2) {
            float temp1 = (float)y * ascenderFactor;
            float temp2 = (float)fontXHeight/2 * xHeightFactor;
            if (temp1<temp2)
                *topDCursor++ = temp2;
            else
                *topDCursor++ = temp1;
            /*
            *topDCursor++ = (float)y * ascenderFactor;
            */
        }
        else
            *topDCursor++ = (float)y * xHeightFactor;

        if (*bottomSCursor == HIT_THE_BOX) {
            y = 0;
            bottomSCursor++;
        }
        else {
            y = *bottomSCursor++ - middleLine;
            if (y<0)
                y = 0;
        }
        if (y<fontXHeight/2)
            *bottomDCursor++ = (float)y * xHeightFactor;
        else {
            float temp1 = (float)y * ascenderFactor;
            float temp2 = (float)fontXHeight/2 * xHeightFactor;
            if (temp1<temp2)
                *bottomDCursor++ = temp2;
            else
                *bottomDCursor++ = temp1;
            /* *bottomDCursor++ = (float)y * ascenderFactor; */
        }
    }

    /* Now try to remove parts of the contour to the left and right of the
     * word shape that are at height 0 */

    /* Find left edge */
    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    for (i=0;i<numberOfLegs;++i) {
        if ((*topDCursor++ != 0)||(*bottomDCursor++ != 0))
            break;
    }
    left = i;

    /* Find right edge */
    topDCursor = temp->top+numberOfLegs-1;
    bottomDCursor = temp->bottom+numberOfLegs-1;
    for (i=numberOfLegs-1;i>=0;--i) {
        if ((*topDCursor-- != 0)||(*bottomDCursor-- != 0))
            break;
    }
    right = i+1;    /* right-hand side is illegible in the printed listing;
                     * i+1 matches the same step in RawOutlineWidth */

    /* Clip the ends of the contour at left and right */
    xDCursor = temp->x;
    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    xCursor = temp->x+left;
    topCursor = temp->top+left;
    bottomCursor = temp->bottom+left;
    foffset = *xSCursor;
    for (i=left;i<right;++i) {
        *xDCursor++ = *xCursor++ - foffset;
        *topDCursor++ = *topCursor++;
        *bottomDCursor++ = *bottomCursor++;
    }
    temp->numberOfLegs = right-left;

    *(dict->outlines+dictEntry) = temp;
    ResampleOutlinePair(*(dict->outlines+dictEntry),widthFactor);
}
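
The piecewise scaling that StoreOutlinePair applies to each top-contour sample can be isolated into one small function. This is a sketch under stated assumptions, not the listing's code: normalize_top is a hypothetical name, the sentinel value given to HIT_THE_BOX here is invented for the example (the real definition lives in a header that is not reproduced), and the two scale factors are written as divisions so that the arithmetic is easy to follow.

```c
#include <assert.h>

#define HIT_THE_BOX (-1)    /* hypothetical sentinel value for this sketch */

/* Normalize one top-contour sample.
 * Heights up to half the x-height are scaled by 1/x_height; taller points
 * are scaled by 1.5/ascender_height, but never below the value reached at
 * half the x-height, so the mapping does not dip at the switch-over. */
float normalize_top(int top, int middle_line, int x_height, int ascender_height)
{
    float scaled, lower_bound;
    int y;

    assert(x_height > 0 && ascender_height > 0);
    if (top == HIT_THE_BOX)
        return 0.0f;                 /* gap between letters */

    y = middle_line - top;           /* flip: higher points become larger */
    if (y < 0)
        y = 0;

    if (y <= x_height / 2)
        return (float)y / (float)x_height;

    scaled = 1.5f * (float)y / (float)ascender_height;
    lower_bound = ((float)x_height / 2.0f) / (float)x_height;
    return scaled < lower_bound ? lower_bound : scaled;
}
```

For a line with middle line 100, x-height 20 and ascender height 30, a sample at row 95 maps to 0.25 and a sample at row 80 maps to 1.0.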

static int lineSpacing;
int OrderOutlinePair(OutlinePair *o1,OutlinePair *o2)
{
    int yDistance;
    int xDistance;
    yDistance = (*o1)->box->pageY - (*o2)->box->pageY;
    if (yDistance < lineSpacing && yDistance > -lineSpacing) {
        xDistance = (*o1)->box->pageX - (*o2)->box->pageX;
        return xDistance;
    }
    return yDistance;
}

void SortDictionary(Dictionary dict)
{
    lineSpacing = 20;
    qsort(dict->rawOutlines,dict->numberOfEntries,sizeof(RawOutlinePair),
          OrderOutlinePair);
}
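
The reading-order comparison used by SortDictionary can be sketched with the comparator signature that standard qsort expects. The BoxPos type and the function names below are hypothetical stand-ins for the listing's box structure; only the 20-pixel line tolerance is taken from the source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the page position stored in a word's box. */
typedef struct { int pageX, pageY; } BoxPos;

#define LINE_SPACING 20     /* same tolerance as SortDictionary */

/* Boxes whose y positions differ by less than LINE_SPACING are treated as
 * being on one text line and ordered left to right; otherwise top to bottom. */
static int order_boxes(const void *p1, const void *p2)
{
    const BoxPos *a = p1;
    const BoxPos *b = p2;
    int dy = a->pageY - b->pageY;

    if (dy < LINE_SPACING && dy > -LINE_SPACING)
        return a->pageX - b->pageX;
    return dy;
}

void sort_reading_order(BoxPos *boxes, size_t n)
{
    assert(boxes != NULL || n == 0);
    qsort(boxes, n, sizeof *boxes, order_boxes);
}
```

A tolerance-based comparison of this kind is only a consistent ordering when text lines are separated by more than the tolerance; for tightly spaced or skewed lines it may not be transitive, and the result of qsort is then unspecified.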

#define HIST_SIZE 100
void HistogramMax(int *data,int dataLength,int offset,int sign,int *histogram)
{
    int i,bin;

    if (sign>0) {
        int maxValue;

        maxValue = *data;
        for (i=0;i<dataLength;++i)
            if (data[i] != HIT_THE_BOX) {
                maxValue = data[i];
                break;
            }
        for (;i<dataLength;++i)
            if (data[i] != HIT_THE_BOX && data[i] > maxValue)
                maxValue = data[i];
        if (maxValue != HIT_THE_BOX) {
            bin = maxValue-offset;
            if ((bin>=0)&&(bin<HIST_SIZE))
                histogram[bin]++;
        }
    }
    else {
        int minValue;
        minValue = *data;
        for (i=0;i<dataLength;++i)
            if (data[i] != HIT_THE_BOX) {
                minValue = data[i];
                break;
            }
        for (;i<dataLength;++i)
            if (data[i] != HIT_THE_BOX && data[i] < minValue)
                minValue = data[i];
        if (minValue != HIT_THE_BOX) {
            bin = minValue-offset;
            if ((bin>=0)&&(bin<HIST_SIZE))
                histogram[bin]++;
        }
    }
}

void Histogram(int *data,int dataLength,int offset,int *histogram)
{
    int i,bin;

    for (i=0;i<dataLength;++i) {
        if (*data != HIT_THE_BOX) {
            bin = *data-offset;
            if ((bin>=0)&&(bin<HIST_SIZE))
                histogram[bin]++;
        }
        data++;
    }
}

int MaxBin(int *histogram)
{
    int i;
    int maxValue;
    int maxIndex;

    maxValue = histogram[0];
    maxIndex = 0;
    for (i=0;i<HIST_SIZE;++i)
        if (histogram[i]>maxValue) {
            maxValue = histogram[i];
            maxIndex = i;
        }
    return maxIndex;
}
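
Histogram and MaxBin together find the most common contour height on a text line. A compact sketch of that combination follows. The function name mode_of_heights is hypothetical, and the sentinel value assigned to HIT_THE_BOX is an assumption made for the example, since its real definition is not part of this listing.

```c
#include <assert.h>
#include <string.h>

#define HIST_SIZE 100
#define HIT_THE_BOX (-1)    /* hypothetical sentinel value for this sketch */

/* Return the most frequent value in data[0..n-1], ignoring HIT_THE_BOX
 * entries and values outside [offset, offset + HIST_SIZE). Ties go to the
 * smallest value. If nothing is counted, offset itself is returned. */
int mode_of_heights(const int *data, int n, int offset)
{
    int hist[HIST_SIZE];
    int i, best = 0;

    assert(data != NULL && n >= 0);
    memset(hist, 0, sizeof hist);

    for (i = 0; i < n; ++i) {
        int bin;
        if (data[i] == HIT_THE_BOX)
            continue;
        bin = data[i] - offset;
        if (bin >= 0 && bin < HIST_SIZE)
            hist[bin]++;
    }
    for (i = 1; i < HIST_SIZE; ++i)
        if (hist[i] > hist[best])
            best = i;
    return best + offset;
}
```

PageStatistics applies the same idea four times per text line, with the top of the highest box on the line as the offset, to estimate the x-height line, the baseline, and the ascender and descender lines.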

int MaxBinAbove(int *histogram,int line)
{
    int i;
    int maxValue;
    int maxIndex;
    int top,bottom;

    for (i=0;i<HIST_SIZE;++i)
        if (histogram[i] != 0)
            break;

    top = i;
    bottom = (line+top)/2;

    maxValue = histogram[top];
    maxIndex = top;
    for (i=top;i<=bottom;++i)
        if (histogram[i]>maxValue) {
            maxValue = histogram[i];
            maxIndex = i;
        }
    return maxIndex;
}

void DrawTextLines(Picture thePict,Dictionary dict,int topLine,int bottomLine)
{
    int maxLength;
    int halfWidth;
    int x,y;
    float x2,x3,y2,y3;
    float angle;

    angle = (*(dict->rawOutlines))->box->angle;
    maxLength = thePict->width + thePict->height;
    halfWidth = thePict->width / 2;
    x = topLine * -sin(angle) + halfWidth * cos(angle);
    y = topLine * cos(angle) + halfWidth * sin(angle);
    x2 = x+maxLength*cos(angle);
    y2 = y+maxLength*sin(angle);
    x3 = x-maxLength*cos(angle);
    y3 = y-maxLength*sin(angle);
    DrawLine(thePict,x,y,(int)x2,(int)y2,5);
    DrawLine(thePict,x,y,(int)x3,(int)y3,5);

    x = bottomLine * -sin(angle) + halfWidth * cos(angle);
    y = bottomLine * cos(angle) + halfWidth * sin(angle);
    x2 = x+maxLength*cos(angle);
    y2 = y+maxLength*sin(angle);
    x3 = x-maxLength*cos(angle);
    y3 = y-maxLength*sin(angle);
    DrawLine(thePict,x,y,(int)x2,(int)y2,5);
    DrawLine(thePict,x,y,(int)x3,(int)y3,5);
}

void PageStatistics(Dictionary dict,char *fileName,NormalizationDescriptor *nd)
/* WARNING - this must be run before PostProcess since PostProcess changes the raw
 * shape data. */
{
    int index;
    int temp;
    int i,startIndex,firstY,minY,endIndex,shape;
    int tops[HIST_SIZE];
    int bottoms[HIST_SIZE];
    int ascenders[HIST_SIZE];






    int descenders[HIST_SIZE];
    int middleLine,topLine,bottomLine,ascenderLine,descenderLine;
    int ascenderHeight,descenderHeight,lineNumber;
    int fontXHeight,fontXWidth,xIndex;
    RawOutlinePair thisShape;
    FILE *fp;
    BOOLEAN haveFirstFontXWidth = FALSE;
    int firstFontXWidth;

    if ((fp=fopen(fileName,"w"))==NULL)
        DoError("PageStatistics: error opening output file %s.\n",fileName);
    SortDictionary(dict);

    index = 0;
#ifdef foo
    malloc_verify();
#endif

    lineNumber = 0;

    while (index < dict->numberOfEntries) {
        startIndex = index;
        firstY = (*(dict->rawOutlines+index))->box->pageY;
        minY = firstY;
        while ((*(dict->rawOutlines+index))->box->pageY-firstY < 20 &&
               (*(dict->rawOutlines+index))->box->pageY-firstY > -20) {
            if (minY > ((*(dict->rawOutlines+index))->box->pageY))
                minY = (*(dict->rawOutlines+index))->box->pageY;
            ++index;
            if (index == dict->numberOfEntries)
                break;
        }
        endIndex = index;

#ifdef foo
        malloc_verify();
#endif

        /* shapes from startIndex through endIndex are all on */
        /* the same text line */
        /* minY has the top of the highest box on the line. */

        /* Find the base and toplines by taking the mode of the heights of the
         * valleys of the bottom contours and the peaks of the top contours */
        for (i=0;i<HIST_SIZE;i++) {
            tops[i] = 0;
            bottoms[i] = 0;
            ascenders[i] = 0;
            descenders[i] = 0;
        }
        for (shape=startIndex;shape<endIndex;++shape) {
            thisShape = *(dict->rawOutlines+shape);
            Histogram(thisShape->top,thisShape->numberOfLegs,minY,tops);
            Histogram(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);
            HistogramMax(thisShape->top,thisShape->numberOfLegs,minY,-1,ascenders);
            HistogramMax(thisShape->bottom,thisShape->numberOfLegs,minY,1,descenders);
        }
        topLine = MaxBin(tops) + minY;
        bottomLine = MaxBin(bottoms) + minY;
        ascenderLine = MaxBin(ascenders) + minY;
        descenderLine = MaxBin(descenders) + minY;

        if (thePict)
            DrawTextLines(thePict,dict,topLine,bottomLine);
#ifdef foo
        malloc_verify();
#endif

        middleLine = (bottomLine+topLine)/2;
        fontXHeight = bottomLine-topLine;
        ascenderHeight = bottomLine-ascenderLine;
        if ((float)ascenderHeight/(float)fontXHeight < 1.1) {
            fprintf(stderr,"Bad ascender height on line %d.\n",lineNumber);
            ascenderLine = MaxBinAbove(ascenders,ascenderLine-minY) + minY;
            ascenderHeight = bottomLine-ascenderLine;
            fprintf(stderr,"New ascender height = %d.\nNew xheight = %d.\n",
                    ascenderHeight,fontXHeight);
        }

        fprintf(fp,"%d: %d %d %2.6f\n",lineNumber,fontXHeight,ascenderHeight,
                (float)ascenderHeight/(float)fontXHeight);

#ifdef foo
        /* Assume that the first shape in the image is the letter x.
         * Use this shape to compute the fontXWidth value. */
        if (lineNumber == 0)
            fontXWidth = RawOutlineWidth(*(dict->rawOutlines),middleLine);
#endif

        ++lineNumber;
        if (fontXHeight < 0) {
            fprintf(stderr,"PageStatistics: negative fontXHeight in line %d.\n",lineNumber);
            fontXHeight *= -1;
        }
        for (shape=startIndex;shape<endIndex;++shape)
            StoreOutlinePair(dict,shape,middleLine,fontXHeight,ascenderHeight,nd);
    } /* Do another line of text */
    fclose(fp);
}






Jan 12 17:35 1991 getAll.c

#include <stdio.h>
#include <math.h>
#include <values.h>
#include "boolean.h"
#include "types.h"
#include "pict.h"
#include "dict.h"

#define MAX_STRING_LEN 256

void WriteShiftedAsciiOutline(FILE *fp, OutlinePair outline, float x, float y)
{
    int i;
    for (i=0;i<outline->numberOfLegs;++i)
        fprintf(fp,"%f %f\n",i+x,*(outline->top+i)+y);
    fprintf(fp,"\"top\n\n");

    for (i=0;i<outline->numberOfLegs;++i)
        fprintf(fp,"%f %f\n",i+x,y-(*(outline->bottom+i)));
    fprintf(fp,"\"bottom\n\n");
}

void WriteOutlines(char *filename,Dictionary dict)
{
    float maxWidth,maxHeight;
    int i,j,count;
    int width,height;
    float x,y;
    OutlinePair outline;
    FILE *fp;
    if ((fp=fopen(filename,"w"))==NULL) {
        printf("Error opening %s.\n",filename);
        exit(-1);
    }

    maxWidth = 0;
    maxHeight = 0;
    for (i=0;i<dict->numberOfEntries;++i) {
        outline = *(dict->outlines+i);
        if (outline->numberOfLegs > maxWidth)
            maxWidth = outline->numberOfLegs;
        /* Track the largest contour value. The printed listing assigns the
         * result of the comparison (0 or 1) to maxHeight in both branches and
         * reads the bottom contour in the second one; both look like slips. */
        for (j=0;j<outline->numberOfLegs;++j) {
            if (*(outline->bottom+j)>maxHeight)
                maxHeight = *(outline->bottom+j);
            if (*(outline->top+j)>maxHeight)
                maxHeight = *(outline->top+j);
        }
    }

    printf("maxWidth,maxHeight = %f,%f\n",maxWidth,maxHeight);

    width = irint(sqrt((double)(dict->numberOfEntries)));
    height = irint((double)(dict->numberOfEntries) / width);

    printf("n,width,height = %d,%d,%d\n",dict->numberOfEntries,width,height);

    for (i=0;i<height;++i)
        for (j=0;j<width;++j) {
            count = i*width+j;
            if ((count < 16) && (count < dict->numberOfEntries)) {
                x = j*maxWidth*1.5;
                y = (height-i+1)*maxHeight*3;
                printf("(%f,%f) ",x,y);
                WriteShiftedAsciiOutline(fp,*(dict->outlines+count),x,y);
            }
        }
    fclose(fp);
}

void main(int argc,char **argv)
{
    char *infile,*outfile;
    Dictionary dict;

    if (argc != 3) {
        printf("Usage:\n");
        printf(" %s infile outfile\n",argv[0]);
        exit(-1);
    }

    infile = argv[1];
    outfile = argv[2];
    dict = ReadDictionary(infile);

    WriteOutlines(outfile,dict);

    printf("\n");
}

Jul 8 14:25 1991 getOutline.c



#include <stdio.h>
#include <math.h>
#include <values.h>
#include <strings.h>
#include "boolean.h"
#include "types.h"
#include "pict.h"
#include "dict.h"

extern char *strchr(char *s,int c);

#define MAX_STRING_LEN 256

void WriteAsciiOutline(char *filename, OutlinePair outline)
{
    FILE *fp;
    int i;
    if ((fp=fopen(filename,"w"))==NULL) {
        printf("Error opening %s.\n",filename);
        exit(-1);
    }
    for (i=0;i<outline->numberOfLegs;++i)
        fprintf(fp,"%d %f\n",i,*(outline->top+i));
    fprintf(fp,"\"top\n\n");

    for (i=0;i<outline->numberOfLegs;++i)
        fprintf(fp,"%d %f\n",i,*(outline->bottom+i));
    fprintf(fp,"\"bottom\n\n");
    fclose(fp);
}

void main(int argc,char **argv)
{
    char *infile;
    char s[MAX_STRING_LEN],outfile[MAX_STRING_LEN];
    Dictionary dict;
    int selection;
    char *crPointer;
    BOOLEAN done = FALSE;

    if (argc != 2) {
        printf("Usage:\n");
        printf(" %s infile\n",argv[0]);
        exit(-1);
    }

    infile = argv[1];
    dict = ReadDictionary(infile);

    while (!done) {
        printf("Shape number [0..%d]: ",dict->numberOfEntries-1);
        fgets(s,MAX_STRING_LEN,stdin);
        if (sscanf(s,"%d",&selection)==1) {
            if (selection < 0 || selection >= dict->numberOfEntries)
                printf("Shape numbers must be between 0 and %d, inclusive.\n",
                       dict->numberOfEntries-1);
            else {
                printf("Output file: ");
                fgets(outfile,MAX_STRING_LEN,stdin);
                crPointer = strchr(outfile,'\n');
                if (crPointer != NULL)
                    *crPointer = '\0';
                printf("Writing shape %d to file %s\n",selection,outfile);
                WriteAsciiOutline(outfile,*(dict->outlines+selection));
            }
        }
        else if ((s[0] == '\0') || (s[0] == '\n'))
            done = TRUE;
        else {
            printf("Enter an integer to select a shape or a blank line\n");
            printf("to quit.\n");
        }
    }
}

Jan 11 17:06 1991 guassian.c

#include <stdio.h>
#include <math.h>
#include <values.h>

float square(float x)
{
    return x*x;
}

float gaussian(a, s, x) /* return A*GAUSS(SIGMA, X) */
float a,s,x;
{
    return (a*exp(-square(x/s)/2.0))/(s*sqrt(2.0*M_PI));
}

float *MakeMask(int halfMaskSize, float a)
{
    int mask_size;
    int x;
    float s;
    float *mask, sum;

    mask_size = 2*halfMaskSize+1;
    s = halfMaskSize/2;
    mask = (float *) calloc(halfMaskSize+1, sizeof(float));
    if (mask == NULL) {
        printf("MakeMask: cannot allocate space\n");
        exit(-1);
    }

    for (x=0; x<=halfMaskSize; x++) {
        mask[x] = gaussian(a, s, (float) x);
        /* printf("%e\n",mask[x]); */
    }

    for (sum = fabs(mask[0]), x = 1; x <= halfMaskSize; x++)
        sum += 2.0*fabs(mask[x]);

    for (x = 0; x <= halfMaskSize; x++)
        mask[x] /= sum;

    return mask;
}

void Guass1DFloat(float *data, int n, int halfMaskSize)
{
    float a;
    float *mask;
    float *newData;
    float *leftPtr,*rightPtr;
    float sum;
    int i,j,left,right;

    a = 1;

    if (n < halfMaskSize*2+1)
        return;

    newData = (float *)calloc(n,sizeof(float));
    if (newData == NULL) {
        printf("Guass1DFloat: cannot allocate space\n");
        exit(-1);
    }

    mask = MakeMask(halfMaskSize,a);

    for (i=halfMaskSize;i<n-halfMaskSize;++i) {
        sum = *(data+i) * mask[0];
        leftPtr = rightPtr = data+i;
        for (j=1;j<halfMaskSize;++j)
            sum += mask[j] * (*(--leftPtr) + *(++rightPtr));
        newData[i] = sum;
    }

    for (i=0;i<halfMaskSize;++i) {
        sum = data[i]*mask[0];
        left = i;
        right = i;
        for (j=1;j<halfMaskSize;++j) {
            if (--left < 0)
                left += n;
            if (++right >= n)
                right -= n;
            sum += mask[j] * (data[left] + data[right]);
        }
        newData[i] = sum;
    }

    for (i=n-halfMaskSize;i<n;++i) {
        sum = data[i]*mask[0];
        left = i;
        right = i;
        for (j=1;j<halfMaskSize;++j) {
            if (--left < 0)
                left += n;
            if (++right >= n)
                right -= n;
            sum += mask[j] * (data[left] + data[right]);
        }
        newData[i] = sum;
    }

    leftPtr = data;
    rightPtr = newData;
    for (i=0;i<n;++i)
        *leftPtr++ = *rightPtr++;
    free(newData);
}

Aug 23 19:21 1991 lines.c



#include <stdio.h>
#include <values.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "lines.h"

void LineEngine(Picture pict,
                int x1,
                int y1,
                int x2,
                int y2,
                UCHAR color,
                PistFunc PerPixel)
{
    static int inside = 0;
    int xinc,yinc;
    int distance;
    int left,right,top,bottom;

    ++inside;
    left = 0;
    right = pict->width-1;
    top = 0;
    bottom = pict->height-1;
    /* printf("Drawline: (%d,%d)-(%d,%d)\n",x1,y1,x2,y2); */
    /* CASE VERTICAL */
    yinc = y2-y1;
    xinc = x2-x1;
    if (xinc > 0) {
        if (yinc > 0) {
            /* Line goes up to the right */
            if (yinc>xinc)
                distance = -yinc;
            else
                distance = xinc;
            while ((*PerPixel)(pict,x1,y1,
                               ((x1<x2)||(y1<y2))&&(x1<=right)&&(y1<=bottom),
                               color)) {
                if (distance > 0) {
                    /* move right */
                    x1++;
                    distance -= yinc;
                } else {
                    /* move up */
                    y1++;
                    distance += xinc;
                }
            }
        } else {
            if (-yinc>xinc)
                distance = yinc;
            else
                distance = xinc;
            while ((*PerPixel)(pict,x1,y1,
                               ((x1<x2)||(y1>y2))&&(x1<=right)&&(y1>=top),
                               color)) {
                if (distance > 0) {
                    /* move right */
                    x1++;
                    distance += yinc;
                } else {
                    /* move down */
                    y1--;
                    distance += xinc;
                }
            }
        }
    } else {
        if (yinc > 0) {
            /* Line goes up to the left */
            if (yinc>-xinc)
                distance = -yinc;
            else
                distance = -xinc;
            while ((*PerPixel)(pict,x1,y1,
                               ((x1>x2)||(y1<y2))&&(x1>=left)&&(y1<=bottom),
                               color)) {
                if (distance > 0) {
                    /* move left */
                    x1--;
                    distance -= yinc;
                } else {
                    /* move up */
                    y1++;
                    distance -= xinc;
                }
            }
        } else {
            if (-yinc>-xinc)
                distance = yinc;
            else
                distance = -xinc;
            while ((*PerPixel)(pict,x1,y1,
                               ((x1>x2)||(y1>y2))&&(x1>=left)&&(y1>=top),
                               color)) {
                if (distance > 0) {
                    /* move left */
                    x1--;
                    distance += yinc;
                } else {
                    /* move down */
                    y1--;
                    distance -= xinc;
                }
            }
        }
    }
    --inside;
}
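
LineEngine separates the stepping along a line from the work done at each pixel, which is supplied as a callback. The sketch below shows the same division of labour using the standard all-octant integer Bresenham stepping, which is a substitute for, not a copy of, the listing's own error-term scheme: it may step diagonally, whereas LineEngine moves along one axis at a time. All names (PixelVisitor, walk_line, count_pixel) are hypothetical, and no clipping to a picture is done.

```c
#include <assert.h>
#include <stdlib.h>

/* Per-pixel callback; return 0 to stop the walk early. */
typedef int (*PixelVisitor)(int x, int y, void *ctx);

/* Walk from (x1,y1) to (x2,y2) inclusive with integer Bresenham stepping,
 * calling visit at each pixel. Returns the number of pixels visited. */
int walk_line(int x1, int y1, int x2, int y2, PixelVisitor visit, void *ctx)
{
    int dx = abs(x2 - x1), sx = x1 < x2 ? 1 : -1;
    int dy = -abs(y2 - y1), sy = y1 < y2 ? 1 : -1;
    int err = dx + dy;
    int visited = 0;

    assert(visit != NULL);
    for (;;) {
        int e2;

        ++visited;
        if (!visit(x1, y1, ctx))
            break;                      /* callback asked to stop */
        if (x1 == x2 && y1 == y2)
            break;                      /* reached the end point */
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x1 += sx; }
        if (e2 <= dx) { err += dx; y1 += sy; }
    }
    return visited;
}

/* Example visitor: count every pixel handed to it and never stop. */
int count_pixel(int x, int y, void *ctx)
{
    (void)x;
    (void)y;
    ++*(int *)ctx;
    return 1;
}
```

Swapping the visitor changes what the walk does, in the way DrawPiston, CountPiston and DistancePiston are swapped into LineEngine in this file.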

BOOLEAN DrawPiston(Picture pict, int x, int y, BOOLEAN test, UCHAR color)
{
    if (test)
        WriteClippedPixel(pict,x,y,color);
    return test;
}

static UCHAR bitmasks[] = { 0x80,0x40,0x20,0x10,0x8,0x4,0x2,0x1 };

void CountLine1Bit(Picture pict,
                   int x1,
                   int y1,
                   int x2,
                   int y2,
                   int *totalSet,
                   int *total)
{
    static int inside = 0;
    int xinc,yinc;
    int distance;
    int left,right,top,bottom;

    int uchar_width;
    UCHAR *cursor;
    UCHAR mask;
    int count = 0;
    int pixels = 0;

    ++inside;
    left = 0;
    right = pict->width-1;
    top = 0;
    bottom = pict->height-1;

    if (pict->depth != 1)
        DoError("CountLine1Bit: Only depth 1 is supported.\n",NULL);

    uchar_width = pict->uchar_width;
    cursor = pict->data+y1*uchar_width+(x1>>3);
    mask = bitmasks[x1%8];

    /* printf("Drawline: (%d,%d)-(%d,%d)\n",x1,y1,x2,y2); */
    /* CASE VERTICAL */
    yinc = y2-y1;
    xinc = x2-x1;
    if (xinc>0) {
        if (yinc>0) {
            /* Line goes up to the right */
            if (yinc>xinc)
                distance = -yinc;
            else
                distance = xinc;
            while (((x1<x2)||(y1<y2))&&(x1<=right)&&(y1<=bottom)) {
                if (*cursor & mask)
                    ++count;
                ++pixels;
                if (distance > 0) {
                    /* move right */
                    if (mask == 0x1) {
                        mask = 0x80;
                        ++cursor;
                    }
                    else
                        mask = mask >> 1;
                    x1++;
                    distance -= yinc;
                } else {
                    /* move up */
                    cursor += uchar_width;
                    y1++;
                    distance += xinc;
                }
            }
        } else {
            if (-yinc>xinc)
                distance = yinc;
            else
                distance = xinc;
            while (((x1<x2)||(y1>y2))&&(x1<=right)&&(y1>=top)) {
                if (*cursor & mask)
                    ++count;
                ++pixels;
                if (distance > 0) {
                    /* move right */
                    if (mask == 0x1) {
                        mask = 0x80;
                        ++cursor;
                    }
                    else
                        mask = mask >> 1;
                    x1++;
                    distance += yinc;
                } else {
                    /* move down */
                    cursor -= uchar_width;
                    y1--;
                    distance += xinc;
                }
            }
        }
    } else {
        if (yinc>0) {
            /* Line goes up to the left */
            if (yinc>-xinc)
                distance = -yinc;
            else
                distance = -xinc;
            while (((x1>x2)||(y1<y2))&&(x1>=left)&&(y1<=bottom)) {
                if (*cursor & mask)
                    ++count;
                ++pixels;
                if (distance > 0) {
                    /* move left */
                    if (mask == 0x80) {
                        mask = 0x1;
                        --cursor;
                    }
                    else
                        mask = mask << 1;
                    x1--;
                    distance -= yinc;
                } else {
                    /* move up */
                    cursor += uchar_width;
                    y1++;
                    distance -= xinc;
                }
            }
        } else {
            if (-yinc>-xinc)
                distance = yinc;
            else
                distance = -xinc;
            while (((x1>x2)||(y1>y2))&&(x1>=left)&&(y1>=top)) {
                if (*cursor & mask)
                    ++count;
                ++pixels;
                if (distance > 0) {
                    /* move left */
                    if (mask == 0x80) {
                        mask = 0x1;
                        --cursor;
                    }
                    else
                        mask = mask << 1;
                    x1--;
                    distance += yinc;
                } else {
                    /* move down */
                    cursor -= uchar_width;
                    y1--;
                    distance -= xinc;
                }
            }
        }
    }
    --inside;
    *totalSet += count;
    *total += pixels;
}

void DrawLine(Picture pict, int x1, int y1, int x2, int y2, UCHAR color)
{
    LineEngine(pict,x1,y1,x2,y2,color,DrawPiston);
}

static int pixelCounter;
static int setCounter;
BOOLEAN CountPiston(Picture pict, int x, int y, BOOLEAN test, UCHAR color)
{
    if (test) {
        ++pixelCounter;
        if (ReadPixel(pict,x,y))
            ++setCounter;
    }
    return test;
}

#ifdef foo
float CountLine(Picture pict, int x1, int y1, int x2, int y2)
{
    pixelCounter = 0;
    setCounter = 0;
    LineEngine(pict,x1,y1,x2,y2,0,CountPiston);
    LineEngine(pict,x1,y1,x1-(x2-x1),y1-(y2-y1),0,CountPiston);
    return (float)setCounter/pixelCounter;
}
#endif

float CountLine(Picture pict, int x1, int y1, int x2, int y2)
{
    pixelCounter = 0;
    setCounter = 0;
    CountLine1Bit(pict,x1,y1,x2,y2,&setCounter,&pixelCounter);
    CountLine1Bit(pict,x1,y1,x1-(x2-x1),y1-(y2-y1),&setCounter,&pixelCounter);
    return (float)setCounter/pixelCounter;
}

static int startx;
static int starty;
static int endx;
static int endy;
BOOLEAN DistancePiston(Picture pict, int x, int y, BOOLEAN test, UCHAR color)
{
    if (test) {
        if (ReadPixel(pict,x,y)) {
            if ((x == startx)&&(y == starty))
                return test;
            else {
                endx = x;
                endy = y;
                return FALSE;
            }
        }
        else
            return test;
    } else
        return test;
}

int DistanceLine(Picture pict, int x1, int y1, int x2, int y2)
{
    double dx,dy;
    startx = x1;
    starty = y1;
    endx = x2;
    endy = y2;
    LineEngine(pict,x1,y1,x2,y2,0,DistancePiston);
    dx = endx-x1;
    dy = endy-y1;
    return sqrt(dx*dx+dy*dy);
}

#ifdef TEST
void draw(pict)
Picture pict;
{
    float angle;
    float step;
    float x1,y1,x2,y2;
    float r1,r2;
    int xc,yc;

    xc = 320;
    yc = 250;
    r1 = 50;
    r2 = 400;
    step = M_PI*2/50;

    for (angle=0;angle<2*M_PI;angle+=step) {
        x1 = xc + r1*cos(angle);
        y1 = yc + r1*sin(angle);
        x2 = xc + r2*cos(angle);
        y2 = yc + r2*sin(angle);
        DrawLine(pict,(int)x1,(int)y1,(int)x2,(int)y2,0xff);
        printf("%3.2f: %d %d\n",angle,
               CountLine(pict,(int)x1,(int)y1,(int)x2,(int)y2),
               DistanceLine(pict,(int)x1,(int)y1,(int)x2,(int)y2));
    }
}

void main(argc,argv)
int argc;
char **argv;
{
    char *outfile;
    Picture pict;

    if (argc != 2) {
        printf("Usage: %s outfile\n",argv[0]);
        exit(0);
    }
    outfile = argv[1];

    pict = new_pict(640,500,1);

    draw(pict);

    write_pict(outfile,pict);
    printf("done\n");
}
#endif

Aug 23 16:43 1991 maxFilter.c

1 #include <stdio.h> 

2 #include"mylib.h" 
3 

4 extern int irint(double); 
5 

6 #define MAX_SIGNAL_LENGTH (10000) 

7 #def ine MIN_MODE (5) /* MIN MODE must be less than MAX HIST SIZE */ 

8 #def ine MAX_HIST SIZE (500) 

9 #define MAX_PEAKS (100) 

10 #define BASE.PERCENTILE (0.5) 

1 1 float data [MAX_SIGNAL_LENGTH1 ; 

12 intnewSignal[MAX_SIGNAL LENGTH); 
13 

14 int MaxOnlnterval(int startjnt end) 

15 { 

16 inti; 

17 float maxValue ~ data[start]; 

18 intmaxlndex = start; 

19 for (i = start; i< end; + +i) 

20 if (data[ij> maxValue) { 

21 maxValue = data[i]; 

22 maxlndex = i; 

23 } 

24 return maxlndex; 

25 } 
26 

void main(int argc,char **argv)
{
  char *infile,*outfile;
  FILE *inFP,*outFP;
  int signalLength;
  float *cursor;
  int foo;
  int i;
  int maskWidth = 10;
  float maxValue;
  int maxIndex,modeValue,modeIndex;
  int h[MAX_HIST_SIZE];
  int finalCount;
  int finalIndex[MAX_PEAKS];
  float baseThresh;
  BOOLEAN upState;
  float thisRatio,lastRatio;

  DefArg("%s %s","infile outfile",&infile,&outfile);
  ScanArgs(argc,argv);

  if ((inFP=fopen(infile,"r"))==NULL)
    DoError("Error opening file %s.\n",infile);

  /* Read "index value" pairs. The cursor advances only after a
   * successful read so that signalLength is the exact sample count. */
  cursor = data;
  while (fscanf(inFP,"%d %f\n",&foo,cursor)==2) {
    ++cursor;
    if (cursor-data >= MAX_SIGNAL_LENGTH)
      DoError("Signal is too long.\n",NULL);
  }
  signalLength = cursor-data;

  /* Compute the threshold for the black edge to black pixel ratio */
  maxValue = data[0];
  for (i=0;i<signalLength;++i) {
    if (data[i] > maxValue)
      maxValue = data[i];
  }
  baseThresh = maxValue*BASE_PERCENTILE;
  printf("baseThresh = %3.3f\n",baseThresh);

  /* Get the indices of the peaks taller than baseThresh */
  finalCount = 0;
  upState = TRUE;
  lastRatio = 0; /* otherwise read uninitialized on the first pass */
  for (i=0;i<signalLength;++i) {
    thisRatio = data[i];
    if (thisRatio < baseThresh)
      thisRatio = 0;
    if (upState) {
      if (thisRatio < lastRatio) {
        finalIndex[finalCount] = i;
        finalCount++;
        upState = FALSE;
      }
    }
    else {
      /* upState == FALSE */
      if (thisRatio > lastRatio)
        upState = TRUE;
    }
    lastRatio = thisRatio;
    if (finalCount == MAX_PEAKS)
      break;
  }

  /* Histogram the distances between adjacent peaks */
  for (i=0;i<MAX_HIST_SIZE;h[i++] = 0);
  for (i=0;i<finalCount-1;++i) {
    int d;
    d = finalIndex[i+1]-finalIndex[i];
    if (d<MAX_HIST_SIZE)
      h[d]++;
  }

  /* Find the mode of the adjacent distances that is above MIN_MODE */
  modeValue = h[MIN_MODE];
  modeIndex = MIN_MODE;
  for (i=MIN_MODE;i<MAX_HIST_SIZE;++i)
    if (h[i] > modeValue) {
      modeValue = h[i];
      modeIndex = i;
    }

  /* Set the mask width to 0.80 of the most common spacing of largest peaks */
  maskWidth = irint(modeIndex*0.80);
  printf("maskWidth = %d.\n",maskWidth);

  for (i=0;i<signalLength;newSignal[i++] = 0);
  for (i=0;i<signalLength-maskWidth;++i)
    newSignal[MaxOnInterval(i,i+maskWidth)]++;

  if ((outFP=fopen(outfile,"w"))==NULL)
    DoError("Error opening file %s.\n",outfile);
  for (i=0;i<signalLength;++i)
    fprintf(outFP,"%d %d\n",i,newSignal[i]);
  fclose(outFP);
}
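The main routine of maxFilter.c chooses its mask width in two steps: it finds thresholded peaks with a two-state (rising/falling) scan, then takes the most common spacing between adjacent peaks. The sketch below restates those two steps as standalone functions. It is an illustration under assumptions, not the listing itself: the names `find_peaks`, `mode_spacing` and `MAX_GAP` are hypothetical, and the previous-sample value is initialized to zero.

```c
#include <assert.h>

#define MAX_GAP 512   /* assumed upper bound on histogram size */

/* Scan the signal with a rising/falling state. Samples below thresh are
 * treated as 0. Each time the signal starts to fall, record the current
 * index (one past the local maximum, as in the listing). Returns the
 * number of peaks stored in peaks[]. */
int find_peaks(const float *signal, int n, float thresh,
               int *peaks, int max_peaks)
{
    int i, count = 0, rising = 1;
    float last = 0.0f, cur;
    assert(n >= 0 && max_peaks >= 0);
    for (i = 0; i < n && count < max_peaks; ++i) {
        cur = signal[i] < thresh ? 0.0f : signal[i];
        if (rising) {
            if (cur < last) {
                peaks[count++] = i;
                rising = 0;
            }
        } else if (cur > last) {
            rising = 1;
        }
        last = cur;
    }
    return count;
}

/* Most common gap between adjacent peaks among gaps in [min_gap,max_gap).
 * Ties keep the smallest gap; min_gap is returned if no gap qualifies. */
int mode_spacing(const int *peaks, int count, int min_gap, int max_gap)
{
    int hist[MAX_GAP] = {0};
    int i, d, best;
    assert(min_gap >= 0 && min_gap < max_gap && max_gap <= MAX_GAP);
    for (i = 0; i + 1 < count; ++i) {
        d = peaks[i + 1] - peaks[i];
        if (d >= 0 && d < max_gap)
            hist[d]++;
    }
    best = min_gap;
    for (i = min_gap; i < max_gap; ++i)
        if (hist[i] > hist[best])
            best = i;
    return best;
}
```

In the listing the resulting mode is multiplied by 0.80 to obtain the mask width, so that a window never quite spans two neighbouring peaks of the dominant spacing.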
Jun 19 21:22 1991 myWc.c

#include <stdio.h>
#include "boolean.h"
#include "error.h"

typedef int State;
#define WHITE_SPACE 0
#define UNKNOWN_WORD 1
#define ASCENDER_WORD 2

#define MAX_STRING_LENGTH 200

BOOLEAN isWhite(char c)
{
  return (c==' '||c=='\t'||c=='\0'||c=='\n');
}

BOOLEAN isAscender(char c)
{
  return ((c=='b')||(c=='d')||(c=='f')||(c=='h')||(c=='i')||(c=='j')||(c=='k')||(c=='l')||
          (c=='t')||((c>='A')&&(c<='Z'))||((c>='0')&&(c<='9'))||(c=='\'')||(c=='"'));
}

void main(int argc,char **argv)
{
  char *filename;
  FILE *fp;
  char s[MAX_STRING_LENGTH+1];
  char *ptr;
  State state;
  int wordsWithAscenders,wordsWithoutAscenders,words;

  if (argc!=2) {
    fprintf(stderr,"Usage:\n");
    fprintf(stderr," %s <inputfile>\n",argv[0]);
    exit(-1);
  }

  filename = argv[1];
  if ((fp=fopen(filename,"r"))==NULL)
    DoError("%s: cannot open input file.\n",filename);

  wordsWithAscenders = 0;
  wordsWithoutAscenders = 0;
  words = 0;
  fgets(s,MAX_STRING_LENGTH,fp);
  while (!feof(fp)) {
    ptr = s;
    state = WHITE_SPACE;
    while (*ptr != '\0') {
      switch (state) {
      case WHITE_SPACE:
        if (isWhite(*ptr))
          ++ptr;
        else
          state = UNKNOWN_WORD;
        break;
      case UNKNOWN_WORD:
        if (isWhite(*ptr)) {
          ++wordsWithoutAscenders;
          ++words;
          state = WHITE_SPACE;
        }
        if (isAscender(*ptr)) {
          ++wordsWithAscenders;
          ++words;
          ++ptr;
          state = ASCENDER_WORD;
        }
        else
          ++ptr;
        break;
      case ASCENDER_WORD:
        if (isWhite(*ptr))
          state = WHITE_SPACE;
        ++ptr;
        break;
      default:
        DoError("myWc: internal error - bad state.\n",NULL);
      } /* switch */
    } /* while (*ptr ... */
    fgets(s,MAX_STRING_LENGTH,fp);
  } /* while (!feof ... */
  printf("words: %d\n",words);
  printf("words with ascenders: %d\n",wordsWithAscenders);
  printf("words without ascenders: %d\n",wordsWithoutAscenders);
  printf("word ascender/descender ratio: %6.2f\n",
         (float)wordsWithAscenders/(float)wordsWithoutAscenders);
}
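The program above classifies each word of a text file by whether it contains any character drawn with an ascender. As a compact restatement of that idea, the sketch below counts words in a single string. It is a simplification under stated assumptions: it uses `isspace` instead of the listing's four-character whitespace test, the ascender set follows the listing only as far as the scan is legible, and the names `has_ascender` and `count_words` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Nonzero for characters assumed to rise above the x-height:
 * selected lowercase letters, quotes, capitals and digits. */
int has_ascender(char c)
{
    if (c == '\0')
        return 0;
    return strchr("bdfhijklt'\"", c) != NULL
        || isupper((unsigned char)c)
        || isdigit((unsigned char)c);
}

/* Count whitespace-separated words in text. *with_asc receives the
 * number of words containing at least one ascender character. */
int count_words(const char *text, int *with_asc)
{
    int words = 0, in_word = 0, seen = 0;
    assert(text != NULL && with_asc != NULL);
    *with_asc = 0;
    for (;; ++text) {
        int white = (*text == '\0') || isspace((unsigned char)*text);
        if (!white) {
            in_word = 1;
            if (has_ascender(*text))
                seen = 1;
        } else if (in_word) {
            ++words;
            *with_asc += seen;
            in_word = 0;
            seen = 0;
        }
        if (*text == '\0')
            break;
    }
    return words;
}
```

For the string "the cat ran on a sea" this reports six words, two of which ("the" and "cat") contain ascenders.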
Aug 23 18:12 1991 newBaselines.c

#include <stdio.h>
#include <values.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "types.h"
#include "lists.h"
#include "lines.h"
#include "baselines.h"

extern double sqrt(double);
extern int irint(double);

/*inline*/ int NewReadPixel(UCHAR *base,int width,float x,float y)
{
  int xi;
  int yi;
  UCHAR mask;

  xi = irint(x);
  yi = irint(y);
  mask = 0x80 >> (xi & 0x7);
  return *(base+yi*width+(xi>>3)) & mask;
}
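NewReadPixel addresses a packed 1-bit image: eight pixels share a byte, the byte is found with `x >> 3`, and the bit within it with `0x80 >> (x & 7)`, so the most significant bit is the leftmost pixel. A minimal sketch of that addressing follows, assuming integer coordinates and a caller-supplied row stride in bytes; `read_bit` and `write_bit` are illustrative names, not part of the listing.

```c
#include <assert.h>

/* Nonzero if pixel (x,y) is set in a packed 1-bit image whose rows are
 * row_bytes wide. The most significant bit is the leftmost pixel. */
int read_bit(const unsigned char *base, int row_bytes, int x, int y)
{
    assert(x >= 0 && y >= 0 && (x >> 3) < row_bytes);
    return (base[y * row_bytes + (x >> 3)] & (0x80 >> (x & 7))) != 0;
}

/* Set pixel (x,y) in the same packed layout. */
void write_bit(unsigned char *base, int row_bytes, int x, int y)
{
    assert(x >= 0 && y >= 0 && (x >> 3) < row_bytes);
    base[y * row_bytes + (x >> 3)] |= (unsigned char)(0x80 >> (x & 7));
}
```

For example, in an image two bytes wide, pixel (9,1) lives in byte 3 under mask 0x40.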

void NewCountLine1Bit(Picture pict,int x1,int y1,int x2,int y2,int *black,int *blackEdge)
{
  float x,y;
  float xinc,yinc;
  float xupinc,yupinc;
  float den;
  int b,be;
  int width,ucharWidth,height;
  UCHAR *data;

  width = pict->width;
  ucharWidth = pict->uchar_width;
  height = pict->height;
  data = pict->data;

  den = sqrt((y2-y1)*(y2-y1)+(x2-x1)*(x2-x1));
  xinc = (x2-x1)/den;
  yinc = (y2-y1)/den;
  xupinc = -yinc;
  yupinc = xinc;
  x = x1;
  y = y1;

  b = 0;
  be = 0;

  while (x<width&&x>=0&&y<height&&y>=0) {
    ++b;
    if (NewReadPixel(data,ucharWidth,x,y)) {
      if (!(NewReadPixel(data,ucharWidth,x+xupinc,y+yupinc)&&
            NewReadPixel(data,ucharWidth,x-xupinc,y-yupinc)))
        ++be;
    }
    x += xinc;
    y += yinc;

  }
  *black = b;
  *blackEdge = be;
}


#define MIN_BLACK 5
void NewCountLine(Picture pict,int x1,int y1,int x2,int y2,int *black,int *blackEdge)
{
  *black = 0;
  *blackEdge = 0;
  NewCountLine1Bit(pict,x1,y1,x2,y2,black,blackEdge);
  NewCountLine1Bit(pict,x1,y1,x1-(x2-x1),y1-(y2-y1),black,blackEdge);
}

static float x2offset;
static float y2offset;
static int projectIndex;
static int *blackPixels;
static int *blackEdgePixels;
static int *coordx;
static int *coordy;
BOOLEAN BaseLinePiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  if (test) {
    NewCountLine(pict,x,y,(int)(x+x2offset),(int)(y+y2offset),
                 blackPixels+projectIndex,blackEdgePixels+projectIndex);
    coordx[projectIndex] = x;
    coordy[projectIndex++] = y;
    return test;
  } else
    return test;
}

static int lastX;
static int lastY;
BOOLEAN EndPointPiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  if (test) {
    lastX = x;
    lastY = y;
  }
  return test;
}

void EndPoints(Picture pict,double angle,int *tx,int *ty,int *bx,int *by)
{
  int xc,yc;
  int maxLength;
  float normal;
  float x2,y2,x3,y3;

  /* Make normal to text point in quadrants I and II */
  /* Assume 0 <= angle < 2*M_PI */
  normal = fmod(angle+M_PI/2,2*M_PI);
  if (normal>M_PI)
    normal -= M_PI;

  xc = pict->width/2;
  yc = pict->height/2;

  maxLength = pict->width+pict->height;
  x2 = xc+maxLength*cos(normal); /* At bottom of picture */
  y2 = yc+maxLength*sin(normal);
  x3 = xc-maxLength*cos(normal); /* At top of picture */
  y3 = yc-maxLength*sin(normal);

  LineEngine(pict,xc,yc,(int)x2,(int)y2,0,EndPointPiston);
  *bx = lastX;
  *by = lastY;
  LineEngine(pict,xc,yc,(int)x3,(int)y3,0,EndPointPiston);
  *tx = lastX;
  *ty = lastY;
}

double distance(int x1,int y1,int x2,int y2)
{
  return sqrt((double)((x1-x2)*(x1-x2)+(y1-y2)*(y1-y2)));
}

FILE *PlotBaselineContour(char *plotFile,int topCount,
                          float *ratios,int *newSignal,
                          float baseThresh)
{
  FILE *outfile;
  int i;

  printf("Opening baselines plot file\n");
  if ((outfile = fopen(plotFile,"w"))==NULL) {
    printf("Error opening baseline plot file.\n");
    exit(-1);
  }
  for (i=0;i<topCount;++i)
    fprintf(outfile,"%d %f\n",i,ratios[i]/baseThresh*5);
  fprintf(outfile,"\"Ratio\n\n");
  for (i=0;i<topCount;++i)
    fprintf(outfile,"%d %d\n",i,newSignal[i]);
  fprintf(outfile,"\"Projection\n\n");
  fprintf(outfile,
          "0 %f\n%d %f\n\"BaselineThreshold\n",
          baseThresh,topCount,baseThresh);
  return outfile;
}

int MaxOnInterval(float *data,int start,int end)
{
  int i;
  float maxValue = data[start];
  int maxIndex = start;
  for (i = start; i < end; ++i)
    if (data[i] > maxValue) {
      maxValue = data[i];
      maxIndex = i;
    }
  return maxIndex;
}

#define BASE_PERCENTILE 0.50
#define MIN_LINE_HEIGHT_FRACTION 0.50
#define MIN_MODE (5) /* MIN_MODE must be less than MAX_HIST_SIZE */
#define MAX_HIST_SIZE (500)
#define MAX_BASELINES (300)
List BaseLines(Picture pict,double angle,char *plotFile)
{
  float *topProjection;
  int *topCoordx,*topCoordy;
  int *finalCoordx,*finalCoordy,*finalIndex;
  int topIndex,bottomIndex;
  int topCount,botCount,finalCount;
  int maxLength;
  int xc,yc;
  float x2,y2,x3,y3;
  float maxValue,lastValue;
  int i,j;
  float baseThresh;
  int topX,topY,bottomX,bottomY;
  BOOLEAN onTextLine;
  List xList,yList,result;
  double totalDistance,averageDistance;
  FILE *outfile;
  int inside;
  BOOLEAN upState;
  float ratio,lastRatio,thisRatio;
  float *ratios;
  int *newSignal;
  int halfMaskWidth = 10; /* for computing ratios */
  int maxIndex,modeValue,modeIndex;
  int h[MAX_HIST_SIZE];
  int maskWidth; /* for max filter */

  printf("angle = %3.3f\n",angle);

  /* The longest line through the picture will be shorter than maxLength */
  maxLength = pict->width+pict->height;

  /* Allocate space for the page projection values */
  blackPixels = (int *)calloc(maxLength,sizeof(int));
  blackEdgePixels = (int *)calloc(maxLength,sizeof(int));
  ratios = (float *)calloc(maxLength,sizeof(float));
  newSignal = (int *)calloc(maxLength,sizeof(int));
  topCoordx = (int *)calloc(maxLength,sizeof(int));
  topCoordy = (int *)calloc(maxLength,sizeof(int));
  finalCoordx = (int *)calloc(maxLength,sizeof(int));
  finalCoordy = (int *)calloc(maxLength,sizeof(int));
  finalIndex = (int *)calloc(maxLength,sizeof(int));

  if ((blackPixels == NULL)||
      (blackEdgePixels == NULL)||
      (ratios == NULL)||
      (newSignal == NULL)||
      (topCoordx == NULL)||
      (topCoordy == NULL)||
      (finalIndex == NULL)||
      (finalCoordx == NULL)||
      (finalCoordy == NULL)) {
    printf("Baselines: cannot allocate memory\n");
    exit(-1);
  }

  /* Compute the endpoints of a line through the center of the picture in the direction
   * perpendicular to the text lines. This line will be used as the reference frame for
   * computing projections. */
  EndPoints(pict,angle,&topX,&topY,&bottomX,&bottomY);

  printf("Main Line: (%d,%d)-(%d,%d)\n",topX,topY,bottomX,bottomY);
  /* DrawLine(pict,topX,topY,bottomX,bottomY,0xff); */

  /* Compute the projection of the image at each point along the line.
   * topProjection will have the number of black pixels on a line and
   * ratios will have the fraction of black pixels on a line that are
   * the ends of vertical extents. */
  x2offset = maxLength*cos(angle);
  y2offset = maxLength*sin(angle);
  projectIndex = 0;
  coordx = topCoordx;
  coordy = topCoordy;
  LineEngine(pict,topX,topY,bottomX,bottomY,0,BaseLinePiston);
  topCount = projectIndex;

  /* Compute the ratios plot */
  for (i=0;i<halfMaskWidth;++i)
    ratios[i] = 0;
  for (i=topCount-halfMaskWidth;i<topCount;++i)
    ratios[i] = 0;
  for (i=0,inside=0;i<halfMaskWidth*2+1;++i)
    inside += blackPixels[i];
  for (i=halfMaskWidth;i<topCount-halfMaskWidth;++i) {
    ratios[i] = (float)blackEdgePixels[i]/inside;
    inside -= blackPixels[i-halfMaskWidth];
    inside += blackPixels[i+halfMaskWidth];
  }

  /* Compute the threshold for the black edge to black pixel ratio */
  maxValue = ratios[0];
  for (i=0;i<topCount;++i) {
    if (ratios[i] > maxValue)
      maxValue = ratios[i];
  }

  baseThresh = maxValue*BASE_PERCENTILE;
  printf("baseThresh = %3.3f\n",baseThresh);

  /* Get the coordinates of the baselines and toplines by finding peaks in the
   * ratios projection. */
  finalCount = 0;
  upState = TRUE;
  lastRatio = 0; /* otherwise read uninitialized on the first pass */
  for (i=0;i<topCount;++i) {
    thisRatio = ratios[i];
    if (thisRatio < baseThresh)
      thisRatio = 0;
    if (upState) {
      if (thisRatio < lastRatio) {
        finalIndex[finalCount] = i;
        finalCount++;
        upState = FALSE;
      }
    }
    else {
      /* upState == FALSE */
      if (thisRatio > lastRatio)
        upState = TRUE;
    }
    lastRatio = thisRatio;
    if (finalCount == MAX_BASELINES) {
      fprintf(stderr,"Warning: found too many baselines.\n");
      fprintf(stderr,"Ignoring remaining baselines.\n");
      break;
    }
  }

  /* Histogram the distances between adjacent peaks */
  for (i=0;i<MAX_HIST_SIZE;h[i++] = 0);
  for (i=0;i<finalCount-1;++i) {
    int d;
    d = finalIndex[i+1]-finalIndex[i];
    if (d<MAX_HIST_SIZE)
      h[d]++;
  }

  /* Find the mode of the adjacent distances that is above MIN_MODE */
  modeValue = h[MIN_MODE];
  modeIndex = MIN_MODE;
  for (i=MIN_MODE;i<MAX_HIST_SIZE;++i)
    if (h[i] > modeValue) {
      modeValue = h[i];
      modeIndex = i;
    }

  /* Set the mask width to 0.80 of the most common spacing of largest peaks */
  maskWidth = irint(modeIndex*0.80);
  printf("maskWidth = %d.\n",maskWidth);

  for (i=0;i<topCount;newSignal[i++] = 0);
  for (i=0;i<topCount-maskWidth;++i)
    newSignal[MaxOnInterval(ratios,i,i+maskWidth)]++;

  /* Plot the baseline contour if requested */
  if (plotFile != NULL)
    outfile = PlotBaselineContour(plotFile,topCount,ratios,newSignal,baseThresh);

  /* Pick off the new peaks */
  /* Compute the threshold for the max-filtered signal */
  maxValue = newSignal[0];
  for (i=0;i<topCount;++i) {
    if (newSignal[i] > maxValue)
      maxValue = newSignal[i];
  }

  baseThresh = maxValue*0.80;
  printf("baseThresh = %3.3f\n",baseThresh);

  /* Get the coordinates of the baselines and toplines by finding peaks in the
   * max-filtered projection. */
  finalCount = 0;
  upState = TRUE;
  lastRatio = 0; /* restart the scan from a known state */
  for (i=0;i<topCount;++i) {
    thisRatio = newSignal[i];
    if (thisRatio < baseThresh)
      thisRatio = 0;
    if (upState) {
      if (thisRatio < lastRatio) {
        finalCoordx[finalCount] = topCoordx[i];
        finalCoordy[finalCount] = topCoordy[i];
        finalIndex[finalCount] = i;
        finalCount++;
        upState = FALSE;
      }
    }
    else {
      /* upState == FALSE */
      if (thisRatio > lastRatio)
        upState = TRUE;
    }
    lastRatio = thisRatio;
    if (finalCount == MAX_BASELINES) {
      fprintf(stderr,"Warning: found too many baselines.\n");
      fprintf(stderr,"Ignoring remaining baselines.\n");
      break;
    }
  }


  if (finalCount&1)
    --finalCount; /* Only take an even number of lines */
  for (totalDistance=0,i=0,j=0;i<finalCount;i+=2) {
    topX = finalCoordx[i];
    topY = finalCoordy[i];
    bottomX = finalCoordx[i+1];
    bottomY = finalCoordy[i+1];
    totalDistance += distance(topX,topY,bottomX,bottomY);
    j+=2;
  }
  averageDistance = totalDistance/(finalCount/2)*MIN_LINE_HEIGHT_FRACTION;
  for (i=0,j=0;i<finalCount;i+=2) {
    topX = finalCoordx[i];
    topY = finalCoordy[i];
    topIndex = finalIndex[i];
    bottomX = finalCoordx[i+1];
    bottomY = finalCoordy[i+1];
    bottomIndex = finalIndex[i+1];
    finalCoordx[j] = topX;
    finalCoordy[j] = topY;
    finalIndex[j] = topIndex;
    finalCoordx[j+1] = bottomX;
    finalCoordy[j+1] = bottomY;
    finalIndex[j+1] = bottomIndex;
    if (distance(topX,topY,bottomX,bottomY) > averageDistance)
      j+=2;
  }
#ifdef foo
  *count = j;
  *returnCoordx = finalCoordx;
  *returnCoordy = finalCoordy;
#endif
  result = nil;
  for (i=j-1;i>=0;--i) {
    push(MakePoint(finalCoordx[i],finalCoordy[i]),result);
  }

  if (plotFile != NULL) {
    fprintf(outfile,"\n0 %f\n",baseThresh);
    for (i=0;i<j;i+=2) {
      fprintf(outfile,"%d %f\n%d %f\n%d %f\n%d %f\n",
              finalIndex[i],-baseThresh,
              finalIndex[i],-2*baseThresh,
              finalIndex[i+1],-2*baseThresh,
              finalIndex[i+1],-baseThresh);
    }
    fprintf(outfile,"\"Baselines");
    fclose(outfile);
    printf("Done writing baseline plot file.\n");
  }

  return result;
}
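The last stage of BaseLines treats the detected peaks as alternating topline/baseline pairs, computes the mean pair height, and keeps only pairs taller than a fixed fraction of that mean (MIN_LINE_HEIGHT_FRACTION in the listing). The sketch below illustrates that filter on one-dimensional positions along the projection axis, which avoids the Euclidean distance call; `filter_line_pairs` is a hypothetical name and the one-dimensional simplification is an assumption made for brevity.

```c
#include <assert.h>

/* pos[] holds alternating top,base positions (count entries). Pairs whose
 * height does not exceed fraction * mean height are dropped in place.
 * Returns the number of entries kept (always even). */
int filter_line_pairs(int *pos, int count, double fraction)
{
    int i, j, pairs;
    double total = 0.0, min_height;
    assert(count >= 0 && fraction >= 0.0);
    count &= ~1;                 /* only take an even number of lines */
    pairs = count / 2;
    if (pairs == 0)
        return 0;
    for (i = 0; i < count; i += 2)
        total += pos[i + 1] - pos[i];
    min_height = total / pairs * fraction;
    for (i = 0, j = 0; i < count; i += 2) {
        int top = pos[i], base = pos[i + 1];
        if (base - top > min_height) {
            pos[j] = top;
            pos[j + 1] = base;
            j += 2;
        }
    }
    return j;
}
```

With positions {10,30, 40,42, 50,70, 80} and a fraction of 0.5, the trailing unpaired entry is discarded, the mean height is 14, and the 2-pixel pair is rejected as too thin to be a text line.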

void DrawBaseLines(Picture pict,List pointList,double angle)
#ifdef foo
int count,int *coordx,int *coordy,double angle)
#endif
{
  int maxLength;
  float x2,y2,x3,y3;
  int x,y;
  Point temp;
  maxLength = pict->width+pict->height;
  while (!endp(pointList)) {
    temp = pop(pointList);
    x = temp->x;
    y = temp->y;
    x2 = x+maxLength*cos(angle);
    y2 = y+maxLength*sin(angle);
    x3 = x-maxLength*cos(angle);
    y3 = y-maxLength*sin(angle);
    DrawLine(pict,x,y,(int)x2,(int)y2,0xff);
    DrawLine(pict,x,y,(int)x3,(int)y3,0xff);
  }
}
Aug 25 19:48 1991 newBlobify.c



#include <stdio.h>
#include <math.h>
#include "mylib.h"
#include "blobify.h"

#define MAX_KERNAL_SIZE (40)
extern int irint(double);

static UCHAR bitmasks[] = {0x80,0x40,0x20,0x10,0x8,0x4,0x2,0x1};
UCHAR *address(Picture pict,float x,float y)
{
  return pict->data+irint(y)*pict->uchar_width+(irint(x)>>3);
}

UCHAR mask(float x)
{
  static int masks[] = {0x80,0x40,0x20,0x10,8,4,2,1};
  return masks[irint(x)%8];
}

int X(float x)
{
  return irint(x);
}

int Y(float y)
{
  return irint(y);
}

Picture NewBlobify(Picture old,int halfMaskWidth,double threshold,double angle)
{
  Picture new;
  int index;
  float x,y,xinc,yinc;
  UCHAR *kernalPtr[MAX_KERNAL_SIZE],*kp[MAX_KERNAL_SIZE];
  UCHAR kernalMask[MAX_KERNAL_SIZE],km[MAX_KERNAL_SIZE];
  int kernalX[MAX_KERNAL_SIZE],kernalY[MAX_KERNAL_SIZE];
  int kx[MAX_KERNAL_SIZE],ky[MAX_KERNAL_SIZE];
  UCHAR kb[MAX_KERNAL_SIZE];
  UCHAR *dest;
  UCHAR dm;
  int tval,i,j,k,inside;
  int width,height,ucharWidth,maskWidth;

  if (halfMaskWidth*2+1 > MAX_KERNAL_SIZE)
    DoError("Blobify: mask is too large.\n",NULL);

  tval = irint(threshold*(halfMaskWidth*2+1));

  width = old->width;
  height = old->height;
  ucharWidth = old->uchar_width;

  new = new_pict(width,height,1);

  xinc = cos(angle);
  yinc = sin(angle);
  index = 0;
  kernalPtr[index] = address(old,halfMaskWidth,halfMaskWidth);
  /*
  kernalX[index] = X(halfMaskWidth);
  kernalY[index] = Y(halfMaskWidth);
  */
  kernalMask[index++] = mask(halfMaskWidth);
  for (i=0,x=0,y=0;i<halfMaskWidth;++i) {
    x+=xinc;
    y+=yinc;
    kernalPtr[index] = address(old,halfMaskWidth+x,halfMaskWidth+y);
    /*
    kernalX[index] = X(halfMaskWidth+x);
    kernalY[index] = Y(halfMaskWidth+y);
    */
    kernalMask[index++] = mask(halfMaskWidth+x);
    kernalPtr[index] = address(old,halfMaskWidth-x,halfMaskWidth-y);
    /*
    kernalX[index] = X(halfMaskWidth-x);
    kernalY[index] = Y(halfMaskWidth-y);
    */
    kernalMask[index++] = mask(halfMaskWidth-x);
  }

  maskWidth = 2*halfMaskWidth+1;

  for (j=0;j<height-maskWidth;++j) {
    for (i=0;i<index;++i) {
      kp[i] = kernalPtr[i]+j*ucharWidth;
      km[i] = kernalMask[i];
      kb[i] = *kp[i]++;
      /*
      kx[i] = kernalX[i];
      ky[i] = kernalY[i]+j;
      */
    }
    dest = new->data+(j+halfMaskWidth)*ucharWidth+(halfMaskWidth>>3);
    dm = mask(halfMaskWidth);

    for (k=0;k<width-maskWidth;++k) {
      if (dm==0) {
        dm = 0x80;
        dest++;
      }
      for (i=0,inside=0;i<index;++i) {
        if (km[i]==0) {
          km[i] = 0x80;
          kb[i] = *kp[i]++;
        }
        /*
        printf("(%d,%d); %d-%x%x-> %x\n",kx[i],ky[i],kb[i]&km[i], ...
        kx[i]++;
        */
        if (kb[i]&km[i])
          ++inside;
        km[i]>>=1;
      }
      /*
      printf("%d\n\n",inside);
      */
      if (inside > tval)
        *dest|=dm;
      dm >>= 1;
    }

  }

  return new;

}
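NewBlobify sets an output pixel when more than a threshold number of the pixels under a short line-shaped mask, centred on that pixel and oriented along the text angle, are black; this smears the characters of a word into a single blob. The sketch below shows the same thresholded count in its simplest form. It rests on two simplifying assumptions that differ from the listing: the image is stored one byte per pixel rather than bit-packed, and the mask is horizontal (angle 0). The name `smear_rows` is illustrative.

```c
#include <assert.h>
#include <string.h>

/* src and dst are w*h images, one byte per pixel, nonzero meaning black.
 * A dst pixel is set when more than tval of the 2*half+1 pixels centred
 * on it in the same row are black in src. Border columns stay clear. */
void smear_rows(const unsigned char *src, unsigned char *dst,
                int w, int h, int half, int tval)
{
    int x, y, k, inside;
    assert(w > 0 && h > 0 && half >= 0);
    memset(dst, 0, (size_t)w * (size_t)h);
    for (y = 0; y < h; ++y)
        for (x = half; x + half < w; ++x) {
            inside = 0;
            for (k = -half; k <= half; ++k)
                inside += src[y * w + x + k] != 0;
            if (inside > tval)
                dst[y * w + x] = 1;
        }
}
```

In the row 1 0 1 0 0 0 0 1 with `half = 1` and `tval = 1`, only the white gap between the first two black pixels is filled, because it is the only position whose three-pixel neighbourhood holds two black pixels.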

#ifdef TRYMAIN
void main(argc,argv)
int argc;
char **argv;
{
  char *infile,*outfile;
  Picture old,new;
  int halfMaskSize;
  float threshold;
  float angle;

  DefArg("%s %s %d %f %f","infile outfile halfMaskSize threshold angle",
         &infile,&outfile,&halfMaskSize,&threshold,&angle);
  ScanArgs(argc,argv);

  printf("Loading %s...",infile);
  old = load_pict(infile);
  new = NewBlobify(old,halfMaskSize,threshold,angle);
  write_pict(outfile,new);
}
#endif
Aug 15 06:41 1991 newContour.c

#include <stdio.h>
#include <values.h>
#include <math.h>
#include "boolean.h"
#include "types.h"
#include "pict.h"
#include "lines.h"
#include "lists.h"
#include "dict.h"
#include "diff.h"
#include "fontNorm.h"

extern Picture thePict; /* Picture used for annotated shapes */

/* The following are misc. definitions and routines having to do with
 * vectors and coordinates. */

typedef struct {
  double x;
  double y;
} DPointBody,*DPoint;


static double Dot(DPoint a,DPoint b)
{
  /* printf("Dot: (%lf,%lf)*(%lf,%lf) = %lf\n",a->x,a->y,b->x,b->y,a->x*b->x +
     a->y*b->y); */
  return a->x*b->x + a->y*b->y;
}

/* Note: radius is not used; the result is always a unit vector. */
static DPoint PolarToCartesian(double angle,double radius)
{
  DPoint result = (DPoint)calloc(1,sizeof(DPointBody));
  if (result == NULL)
    DoError("Dot: cannot allocate space\n");
  result->x = cos(angle);
  result->y = sin(angle);
  return result;
}

static DPoint Normal(DPoint a)
{
  DPoint result = (DPoint)calloc(1,sizeof(DPointBody));
  if (result == NULL)
    DoError("Dot: cannot allocate space\n");
  result->x = -a->y;
  result->y = a->x;
  return result;
}

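The three helpers above are used later in this file to express page coordinates in the frame of a text line: the dot product with the line's unit direction gives the distance along the line, and the dot product with its normal (the direction rotated by 90 degrees) gives the distance across it. A minimal sketch of that change of frame follows, using value types instead of heap-allocated points; `Vec2`, `dot`, `normal_of` and `to_line_frame` are illustrative names, and the direction is assumed to be a unit vector supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } Vec2;

double dot(Vec2 a, Vec2 b)
{
    return a.x * b.x + a.y * b.y;
}

/* Rotate a by 90 degrees: (x,y) -> (-y,x), as in Normal() above. */
Vec2 normal_of(Vec2 a)
{
    Vec2 n;
    n.x = -a.y;
    n.y = a.x;
    return n;
}

/* Express point p in the frame of a line with unit direction dir:
 * *along is the distance along the line, *across the distance along
 * its normal. */
void to_line_frame(Vec2 p, Vec2 dir, double *along, double *across)
{
    assert(along != NULL && across != NULL);
    *along = dot(p, dir);
    *across = dot(p, normal_of(dir));
}
```

For a line running along the y axis, `dir = (0,1)`, the point (3,4) maps to 4 along the line and -3 across it.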
/* This piston scans pict up and down from the top and bottom of the
 * bounding box, looking for the highest and lowest pixels in the
 * word. If thePict is not NULL, these pixels will be colored as 4
 * in thePict. */

static int startX;
static int startY;
static double stopDistance;
static int lastY;
static BOOLEAN valid;
BOOLEAN TracePiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  double distance;
  if (test) {
    distance = sqrt((double)(startX-x)*(startX-x)+(startY-y)*(startY-y));
    if (distance<stopDistance) {
      /* lastY = stopDistance - distance; */
      lastY = distance;

      if (ReadPixel(pict,x,y)) {

        if (thePict)
          WritePixel(thePict,x,y,4);

        valid = TRUE;
        return FALSE;
      } else {
        valid = FALSE;
        return test;
      }
    }
    else {
      if (thePict)
        WritePixel(thePict,x,y,4);
#ifdef foo
      lastY = distance; /**** Used to be 0 ****/
#endif
      lastY = HIT_THE_BOX;
      valid = FALSE;
      return FALSE;
    }
  }
  return test;
}


/* This piston moves from left to right across a bounding box, calling
 * trace piston and saving its output in topY, baseY, and bothX. */
#define MAX_SHELL_LENGTH 400
static int numberOfLegs;
static int topY[MAX_SHELL_LENGTH];
static int baseY[MAX_SHELL_LENGTH];
static int bothX[MAX_SHELL_LENGTH];

static double leftDistance;
static DPoint lineVector;
static int downX;
static int downY;
static double boxTopDistance;
static double boxBaseDistance;
BOOLEAN ShellPiston(Picture pict,int x,int y,BOOLEAN test,UCHAR color)
{
  int xDistance;
  DPointBody thisPoint;
  if (test) {
    if (numberOfLegs >= MAX_SHELL_LENGTH)
      return FALSE;
    thisPoint.x = x;
    thisPoint.y = y;
    xDistance = Dot(&thisPoint,lineVector) - leftDistance;
    stopDistance = boxTopDistance;
    startX = x;
    startY = y;
    LineEngine(pict,x,y,x+downX,y+downY,0,TracePiston);
    bothX[numberOfLegs] = xDistance;
    if (valid)
      topY[numberOfLegs] = lastY;
    else
      topY[numberOfLegs] = HIT_THE_BOX;

    stopDistance = boxBaseDistance;
    startX = x+downX;
    startY = y+downY;
    LineEngine(pict,x+downX,y+downY,x,y,0,TracePiston);
    if (valid)
      baseY[numberOfLegs] = lastY;
    else
      baseY[numberOfLegs] = HIT_THE_BOX;
    numberOfLegs++;

  }
  return test;
}

/* This function finds the upper and lower contours corresponding
 * to a word within a bounding box. */
void MakeShell(Picture pict,Box box,
               Dictionary dict,int dictEntry)
{
  DPoint normalVector;
  DPointBody temp;
  double boxTop,boxBase;
  int rightX,rightY;

  lineVector = PolarToCartesian(box->angle,1);
  normalVector = Normal(lineVector);
  temp.x = box->x;
  temp.y = box->y;
  boxTop = Dot(&temp,normalVector);
  box->pageY = irint(boxTop);
  boxBase = boxTop + box->height;

  /* CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE */
  boxTopDistance = boxBase - boxTop;
  boxBaseDistance = boxBase - boxTop;
  /* CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE */

  downX = box->height*cos(box->angle+M_PI/2);
  downY = box->height*sin(box->angle+M_PI/2);

  rightX = box->width*cos(box->angle);
  rightY = box->width*sin(box->angle);

  numberOfLegs = 0;
  leftDistance = Dot(&temp,lineVector);
  box->pageX = irint(leftDistance);
#ifdef foo
  malloc_verify();
#endif
  LineEngine(pict,box->x,box->y,
             box->x+rightX,box->y+rightY,0,
             ShellPiston);

  /* CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE */
  {
    int i;
    for (i=0;i<numberOfLegs;++i) {
      if (*(topY+i) != HIT_THE_BOX)
        *(topY+i) += boxTop;
      if (*(baseY+i) != HIT_THE_BOX)
        *(baseY+i) = boxBase - *(baseY+i);
    }
  }
  /* CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE CHANGE */

#ifdef foo
  malloc_verify();
#endif
  StoreRawOutlinePair(dict,dictEntry,box,bothX,topY,
                      baseY,numberOfLegs);
}

BOOLEAN OnABaseLine(Box box,List baseLinePoints)
{
  DPoint lineVector,normalVector;
  DPointBody temp;
  double boxTop,boxBase,top,base;
  Point topPoint,basePoint;

  lineVector = PolarToCartesian(box->angle,1);
  normalVector = Normal(lineVector);
  temp.x = box->x;
  temp.y = box->y;
  boxTop = Dot(normalVector,&temp);
  boxBase = boxTop + box->height;

  while (!endp(baseLinePoints)) {
    topPoint = pop(baseLinePoints);
    basePoint = pop(baseLinePoints);
    temp.x = topPoint->x;
    temp.y = topPoint->y;
    top = Dot(normalVector,&temp);
    temp.x = basePoint->x;
    temp.y = basePoint->y;
    base = Dot(normalVector,&temp);

    if ((boxTop >= top && boxTop <= base) ||   /* box top is between */
        (boxBase >= top && boxBase <= base) || /* box bottom is between */
        (top >= boxTop && top <= boxBase))     /* both lines inside box */
      return TRUE;
  }
  return FALSE;
}
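The three-clause test in OnABaseLine asks whether the box's vertical extent [boxTop, boxBase] and a text line's extent [top, base] share any point. Assuming each interval is well ordered (top no greater than base), this appears equivalent to the usual two-comparison overlap test for closed intervals, sketched below; `intervals_overlap` is a hypothetical name used only for illustration.

```c
#include <assert.h>

/* Nonzero when the closed intervals [a_lo,a_hi] and [b_lo,b_hi] share
 * at least one point. */
int intervals_overlap(double a_lo, double a_hi, double b_lo, double b_hi)
{
    assert(a_lo <= a_hi && b_lo <= b_hi);
    return a_lo <= b_hi && b_lo <= a_hi;
}
```

The reasoning is that whenever two intervals overlap, the larger of the two lower bounds lies inside the other interval, which is what the first and third clauses of the listing check.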

BOOLEAN BoxToShell(Picture pict,Box box,List baseLinePoints,
                   Dictionary dict,int dictEntry)
{
  Point topPoint,bottomPoint;

  if (OnABaseLine(box,baseLinePoints)) {
    MakeShell(pict,box,dict,dictEntry);
    return TRUE;
  }
  else
    return FALSE;
}

#define MAX_SHAPES 1000
void BarBoxList(Picture pict,List boxList,List baseLinePoints,
                char *filename,char *infoString,NormalizationDescriptor *nd)
{
  Dictionary dict;
  int count = 0;
  long int location;

  dict = NewDict(MAX_SHAPES);
  dict->infoString = infoString;

  while (!endp(boxList)) {
#ifdef foo
    if (BoxToShell(pict,
                   (Box)pop(boxList),
                   baseLinePoints,
                   dict,
                   count))
      ++count;
#endif
    /* Change 8/8/91
     * All boxes are stored in the dictionary.
     * The post processing stage in newFontNorm.c will weed out boxes */
    MakeShell(pict,(Box)pop(boxList),dict,count);
    ++count;
    /* End of change 8/8/91 */
    if (count >= MAX_SHAPES) {
      printf("Maximum dictionary size exceeded.\n");
      printf("Ignoring rest of shapes.\n");
      break;
    }
  }
  dict->numberOfEntries = count;
  PageStatistics(dict,"statistics",nd);
  /* PostProcess(dict); */
  WriteDictionary(dict,filename);
}

Jan 11 17:07 1991 newDiff2.c



#include <stdio.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"
#include "diff.h"

/* Given the names of two dictionary files, compute the squared difference
 * between every pair of shapes in the cross product of the dictionaries.
 * The result is a matrix printed to stdout. The width and height are
 * followed by the matrix entries in row major order. The output is in
 * ascii to facilitate reading by a Symbolics. */
Picture CompareDictionaries(char *file1,char *file2)
{
  Dictionary dict1,dict2;
  Picture pict;
  int x,y;
  dict1 = ReadDictionary(file1); /* height */
  dict2 = ReadDictionary(file2); /* width */
  pict = new_pict(dict2->numberOfEntries,
                  dict1->numberOfEntries,
                  32);
  for (y=0;y<pict->height;++y)
    for (x=0;x<pict->width;++x) {
      printf("(%d,%d)",y,x);
      *((float *)(pict->data)+pict->width*y+x) =
        DiffPair(*(dict1->outlines+y),
                 *(dict2->outlines+x));
    }
  return pict;
}

34 void WritePictureAsAscii(Picture pict,char *filename) 

35 { 

36 FILE *fp; 

37 int x,y;

38 int count=1;

39 if ((fp = fopen(filename,"w")) == NULL)

40 DoError("WritePictureAsAscii: error opening output file\n",NULL);

41 fprintf(fp,"%d\n%d\n",pict->width,pict->height);

42 for (y = 0;y < pict->height; ++y)

43 for (x = 0;x < pict->width; ++x) {

44 fprintf(fp,"%f ",*((float *)pict->data + pict->width*y + x));

45 if (!((count++)%5))

46 fprintf(fp,"\n");

47 }

48 fprintf(fp,"\n");

49 fclose(fp);

50 } 
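
The comment at the top of newDiff2.c describes a difference matrix stored in row major order, with one dictionary indexing the rows and the other the columns. The following is a minimal sketch of that layout only. DiffPair is not shown in this part of the listing, so sq_diff below is a hypothetical stand-in that assumes both contours are sampled at the same number of points, and compare_shapes is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DiffPair: sum of squared differences between
 * two contours that are assumed to have the same number of samples. */
float sq_diff(const float *a, const float *b, size_t len)
{
    float sum = 0.0f;
    for (size_t i = 0; i < len; ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Fill out[] (height*width floats, row major) so that
 * out[y*width + x] holds the difference between rows[y] and cols[x]. */
void compare_shapes(const float *const *rows, size_t height,
                    const float *const *cols, size_t width,
                    size_t len, float *out)
{
    assert(out != NULL);
    for (size_t y = 0; y < height; ++y)
        for (size_t x = 0; x < width; ++x)
            out[y * width + x] = sq_diff(rows[y], cols[x], len);
}
```

The index expression y*width + x is the same one CompareDictionaries uses when it writes into pict->data.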
Aug 26 17:20 1991 newMain.c



1 #include <stdio.h> 

2 #include <values.h> 

3 #include <math.h> 

4 #include "misc.h"

5 #include "boolean.h"

6 #include "error.h"

7 #include "types.h"

8 #include "pict.h"

9 #include"lists.h" 

10 #include"lines.h" 

11 #include "orient.h" 

12 #include "baselines.h"

13 #include "blobify.h"

14 #include "boxes.h"

15 #include "dict.h"

16 #include "diff.h"

17 #include "newContour.h" 

18 #include "numbers.h" 
19 

20 #define TRY

21 #ifdef TRY

22 Picture thePict; 

23 #endif 
24 

25 void DrawMiddleLines(Picture pict, List pointList, double angle)

26 {

27 int maxLength;

28 int xc,yc,xBot,xTop,yBot,yTop;

29 Point temp;

30 float x2,y2,x3,y3;

31 int i,len;

32 maxLength = pict->width + pict->height;

33 len = ListLength(pointList); 

34 pop(pointList); 

35 for(i = 1;i<len-1;i+=2){ 

36 temp = pop(pointList); 

37 xTop = temp->x; 

38 yTop = temp->y; 

39 temp = pop(pointList); 

40 xBot = temp->x; 

41 yBot = temp->y; 

42 xc = (xBot+xTop)/2; 

43 yc = (yBot+yTop)/2; 

44 x2 = xc+maxLength*cos(angle);

45 y2 = yc+maxLength*sin(angle);

46 x3 = xc-maxLength*cos(angle); 

47 y3 = yc-maxLength*sin(angle); 

48 DrawLine(pict,xc,yc,(int)x2,(int)y2,0); 

49 DrawLine(pict,xc,yc,(int)x3,(int)y3,0);

50 } 

51 } 
52 
53 void DrawBoxList(Picture pict,List boxList)

54 { 

55 while (!endp(boxList)){ 

56 DrawBox(pict,(Box)pop(boxList)); 

57 } 

58 } 
59 

60 void LabelShapes(Picture pict,Dictionary dict)

61 {

62 int i;

63 Box box;
64

65 for (i = 0; i < dict->numberOfEntries; ++i) {

66 box = (*(dict->outlines+i))->box;

67 DrawColorBox(pict,box,3);

68 DrawNumber(pict,box->x,box->y,2,(float)box->height/2,i);

69 } 

70 } 
71 

72 double FixAngle(double angle)

73 {

74 if (angle > M_PI/2 && angle < 1.5*M_PI)

75 return angle-M_PI;

76 else 

77 return angle; 

78 } 
79 

80 int ScanIntArg(int argc,char **argv,int index)

81 {

82 if (index<argc)

83 return atoi(argv[index]);

84 else

85 DoError("Expected an integer argument\n",NULL);

86 }
87

88 float ScanFloatArg(int argc,char **argv,int index)

89 {

90 if (index<argc)

91 return atof(argv[index]);

92 else

93 DoError("Expected a floating point argument\n",NULL);

94 }
95

96 char *ScanStringArg(int argc,char **argv,int index)

97 {

98 if (index<argc)

99 return argv[index];

100 else

101 DoError("Expected a string argument\n",NULL);

102 } 
103 

104 void main(argc,argv) 

105 int argc;

106 char **argv;

107 {

108 char *infile;

109 int coarseDirections,coarseSamples,fineDirections,fineSamples;

110 Picture pict,newPict,finalPict;

111 float coarseAngle,mediumAngle,fineAngle;

112 float coarseError,mediumError,fineError;

113 List baselines,boxList;

114 int maskWidth; 

115 float blobThreshold; 

116 int i; 

117 char *shapesFile, *drawBaselinesFile; 

118 char *drawBoxesFile,*plotFile,*plotOrientFile; 

119 char *drawColorBoxesFile,*drawBlobsFile; 

120 char *flag;

121 BOOLEAN doOrientation,doBaselines,doBoxes,doShapes,drawBaselines,drawBoxes;

122 BOOLEAN plotBaselines,plotOrientation,drawColorBoxes,drawBlobs;

123 BOOLEAN noXHeightNorm,noAscenderNorm,dontOrientation,doBlobThreshold,doMaskWidth;

124 NormalizationDescriptor nd;
125 

126 DefArg("%s","infile",&infile);

127 DefOption("-orientation %f","-orientation (page orientation in radians)",

128 &dontOrientation,&fineAngle);

129 DefOption("-findOrientation","-findOrientation",&doOrientation);

130 DefOption("-plotOrientation %s","-plotOrientation (file to plot xgraph format image to)",

131 &plotOrientation,&plotOrientFile);

132 DefOption("-maskWidth %d","-maskWidth (integer half mask width)",

133 &doMaskWidth,&maskWidth);

134 DefOption("-blobThreshold %f","-blobThreshold (float on/off threshold)",

135 &doBlobThreshold,&blobThreshold);

136 DefOption("-drawBlobs %s","-drawBlobs (file to output image to)",&drawBlobs,&drawBlobsFile);

137 DefOption("-drawBaselines %s","-drawBaselines (file to output image to)",&drawBaselines,

138 &drawBaselinesFile);

139 DefOption("-plotBaselines %s","-plotBaselines (file to plot xgraph format baselines to)",

140 &plotBaselines,&plotFile);

141 DefOption("-drawBoxes %s","-drawBoxes (file to output image to)",&drawBoxes,&drawBoxesFile);

142 DefOption("-shapeFunctions %s","-shapeFunctions (file to output shape functions to)",

143 &doShapes,&shapesFile);

144 DefOption("-annotatedShapes %s","-annotatedShapes (file to output image to)",

145 &drawColorBoxes,&drawColorBoxesFile);

146 DefOption("-noAscenderNorm","-noAscenderNorm",&noAscenderNorm);

147 DefOption("-noXHeightNorm","-noXHeightNorm",&noXHeightNorm);
148 

149 i = 2; 

150 coarseDirections = 72; 

151 coarseSamples = 400; 

152 fineDirections = 40; 

153 fineSamples = 10; 

154 maskWidth = 3; 

155 blobThreshold = 0.01; 
156 

157 ScanArgs(argc,argv);

158 if (dontOrientation)

159 doOrientation = FALSE; 
160 

161 nd.noXHeightNormalize = noXHeightNorm; 

162 nd.noAscenderNormalize = noAscenderNorm; 
163 

164 printf("Loading %s...\n",infile);

165 pict = load_pict(infile);

166 if (pict->depth != 1)

167 DoError("error: only depth 1 is supported\n",NULL);
168

169 if (drawBaselines || drawBoxes)

170 finalPict = new_pict(pict->width,pict->height,pict->depth);
171 

172 if (doOrientation) { 

173 #define NUMBER_OF_ANGLES 180

174 #define SAMPLES_PER_ANGLE 10

175 #define BIN_ERROR 4

176 printf("Finding coarse orientation\n");

177 coarseAngle = NewFine(pict,SAMPLES_PER_ANGLE,NUMBER_OF_ANGLES,

178 0,M_PI,NULL);

179 coarseError = (M_PI-0)/NUMBER_OF_ANGLES;

180 printf("Coarse angle: %f(%f)\n",coarseAngle,coarseAngle/M_PI*180);

181 printf("Coarse error: %f(%f)\n",coarseError,coarseError/M_PI*180);
182

183 mediumAngle = NewFine(pict,SAMPLES_PER_ANGLE,NUMBER_OF_ANGLES,

184 coarseAngle-BIN_ERROR*coarseError,

185 coarseAngle+BIN_ERROR*coarseError,

186 NULL);

187 mediumError = 2*BIN_ERROR*coarseError/NUMBER_OF_ANGLES;

188 printf("Medium angle: %f(%f)\n",mediumAngle,mediumAngle/M_PI*180);

189 printf("Medium error: %f(%f)\n",mediumError,mediumError/M_PI*180);
190

191

192 fineAngle = NewFine(pict,SAMPLES_PER_ANGLE,NUMBER_OF_ANGLES,

193 mediumAngle-15*mediumError,mediumAngle+15*mediumError,

194 plotOrientFile);

195 fineError = 30*mediumError/NUMBER_OF_ANGLES;

196 fineAngle = FixAngle(fineAngle);

197 printf("Fine angle: %f(%f)\n",fineAngle,fineAngle/M_PI*180);

198 printf("Fine error: %f(%f)\n",fineError,fineError/M_PI*180);

199 } 
200 

201 printf("Adjusted angle: %lf\n",fineAngle);
202

203 #ifdef foo

204 printf("Finding baselines\n");

205 baselines = BaseLines(pict,fineAngle,plotBaselines?plotFile:NULL); 
206 

207 if (drawBaselines) { 

208 CopyPicture(finalPict,pict); 

209 DrawBaseLines(finalPict,baselines,fineAngle);

210 write_pict(drawBaselinesFile,finalPict);

211 }
212

213 printf("Blobifying\n");

214 newPict = Blobify(pict,maskWidth,blobThreshold);

215 #endif

216 printf("NewBlobify\n");

217 /* newPict = NewBlobify(pict,maskWidth,blobThreshold,fineAngle); */

218 newPict = Blobify(pict,maskWidth,blobThreshold);

219 printf("Finding baselines\n");

220 baselines = BaseLines(newPict,fineAngle,plotBaselines?plotFile:NULL); 

221 if (drawBaselines) { 

222 CopyPicture(finalPict,pict); 

223 DrawBaseLines(finalPict,baselines,fineAngle);

224 write_pict(drawBaselinesFile,finalPict); 

225 } 
226 

227 

228 DrawMiddleLines(newPict,baselines,fineAngle);

229 if (drawBlobs)

230 write_pict(drawBlobsFile,newPict);

231 printf("Finding boxes\n");

232 boxList = FindBorders(newPict,fineAngle);
233

234 if (drawBoxes) {

235 CopyPicture(finalPict,pict);

236 DrawBoxList(finalPict,boxList);

237 write_pict(drawBoxesFile,finalPict);

238 } 
239 

240 if (doShapes){ 

241 ColorMap cmap; 

242 int x,y;
243

244 if (drawColorBoxes) {

245 thePict = new_pict(pict->width,pict->height,8);

246 cmap = NewColorMap(6); /* black, white, and 16 colors */ 

247 WriteColorValue(cmap,0,0, 128,0); /* Olive */ 

248 WriteColorValue(cmap,1, 0,0,0); /* Black */ 

249 WriteColorValue(cmap,2,255,255,255); /* White */ 

250 WriteColorValue(cmap,3,0,0,255); /* Blue */ 

251 WriteColorValue(cmap,4,255,255,80); /* Yellow */ 

252 WriteColorValue(cmap,5,128,0,0); /* Blood */ 

253 thePict->cmap = cmap; 

254 for (y = 0;y < pict->height; ++y)

255 for (x = 0;x < pict->width; ++x)

256 WritePixel(thePict,x,y,ReadPixel(pict,x,y)?0:1); 

257 } 

258 else 

259 thePict = NULL; /* Important */ 

260

261 printf("Tracing outlines\n");

262 BarBoxList(pict,boxList,baselines,shapesFile,ArgListToString(argc,argv),&nd); 
263 

264 if (drawColorBoxes) { 

265 Dictionary dict;
266

267 dict = ReadDictionary(shapesFile);

268 LabelShapes(thePict,dict);

269 write_pict(drawColorBoxesFile,thePict);

270 } 

271 } 

272 } 
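
The orientation code in main() narrows the page angle in three passes: a scan over [0, pi), a second scan over a window of BIN_ERROR coarse bins on either side of the first result, and a third scan over 15 medium bins on either side of the second. A minimal sketch of that narrowing, assuming a generic scoring callback in place of NewFine, is given below. ScoreFn, demo_score, best_angle and coarse_to_fine are illustrative names that do not appear in the listing, and the sketch scans angles deterministically rather than sampling random edge pixels.

```c
#include <assert.h>

#define PI 3.14159265358979323846

typedef double (*ScoreFn)(double angle);

/* Hypothetical score with a single peak at 0.3 radians; the listing
 * scores an angle by counting pixels along rays instead. */
double demo_score(double angle)
{
    double d = angle - 0.3;
    return -(d * d);
}

/* Evaluate n evenly spaced angles in [start, end) and return the best. */
double best_angle(ScoreFn score, int n, double start, double end)
{
    double step = (end - start) / n;
    double best = start;
    double bestScore = score(start);
    for (int i = 1; i < n; ++i) {
        double a = start + i * step;
        double s = score(a);
        if (s > bestScore) {
            bestScore = s;
            best = a;
        }
    }
    return best;
}

/* Three passes with the same window arithmetic as the listing. */
double coarse_to_fine(ScoreFn score, int n, int binError)
{
    assert(n > 0);
    double coarse = best_angle(score, n, 0.0, PI);
    double coarseError = PI / n;
    double medium = best_angle(score, n,
                               coarse - binError * coarseError,
                               coarse + binError * coarseError);
    double mediumError = 2.0 * binError * coarseError / n;
    return best_angle(score, n,
                      medium - 15.0 * mediumError,
                      medium + 15.0 * mediumError);
}
```

With n = 180 and binError = 4, the values used in the listing, the final step size is 30*mediumError/180, which is the quantity main() reports as fineError.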
Jan 11 17:07 1991 numbers.c



1 #include "stdio.h" 

2 #include "boolean.h"

3 #include "pict.h"

4 #include "lines.h"
5

6 static float localScale;

7 static int localColor; 

8 static int localX; 

9 static int localY; 

10 static Picture localPict; 
11 

12 void DrawSegment(float y1,float x1,float y2,float x2)

13 {

14 DrawLine(localPict,irint(localX+x1*localScale),

15 irint(localY+y1*localScale),

16 irint(localX+x2*localScale),

17 irint(localY+y2*localScale),localColor); 

18 } 
19 

20 void Draw0(Picture pict, int x, int y, int color,float scale)

21 {

22 localPict = pict;

23 localScale = scale;

24 localColor = color;

25 localX = x;

26 localY = y;

27 DrawSegment(0,0,0,1);

28 DrawSegment(1,0,1,1);

29 DrawSegment(0,0,1,0);

30 DrawSegment(0,1,1,1);

31 } 
32 

33 void Draw1(Picture pict, int x, int y, int color,float scale)

34 { 

35 localPict = pict; 

36 localScale = scale; 

37 localColor = color; 

38 localX = x; 

39 localY = y; 

40 DrawSegment(0,0.5,1,.5); 

41 } 
42 

43 void Draw2(Picture pict, int x, int y, int color,float scale)

44 {

45 localPict = pict;

46 localScale = scale;

47 localColor = color;

48 localX = x;

49 localY = y;

50 DrawSegment(0,0,0,1);

51 DrawSegment(0,1,.5,1);

52 DrawSegment(.5,1,.5,0);

53 DrawSegment(.5,0,1,0);

54 DrawSegment(1,0,1,1);

55 } 
56 

57 void Draw3(Picture pict, int x, int y, int color,float scale)

58 {

59 localPict = pict;

60 localScale = scale;

61 localColor = color;

62 localX = x;

63 localY = y;

64 DrawSegment(0,0,0,1);

65 DrawSegment(0,1,1,1);

66 DrawSegment(1,0,1,1);

67 DrawSegment(.5,0,.5,1); 

68 } 
69 

70 void Draw4(Picture pict, int x, int y, int color,float scale)

71 {

72 localPict = pict;

73 localScale = scale;

74 localColor = color;

75 localX = x;

76 localY = y;

77 DrawSegment(0,0,.5,0);

78 DrawSegment(0,1,1,1);

79 DrawSegment(.5,0,.5,1);

80 } 
81 

82 void Draw5(Picture pict, int x, int y, int color,float scale)

83 {

84 localPict = pict;

85 localScale = scale;

86 localColor = color;

87 localX = x;

88 localY = y;

89 DrawSegment(0,0,0,1);

90 DrawSegment(0,0,.5,0);

91 DrawSegment(.5,1,.5,0);

92 DrawSegment(.5,1,1,1);

93 DrawSegment(1,0,1,1);

94 } 
95 

96 void Draw6(Picture pict, int x, int y, int color,float scale)

97 {

98 localPict = pict;

99 localScale = scale;

100 localColor = color;

101 localX = x;

102 localY = y;

103 DrawSegment(0,0,0,1);

104 DrawSegment(0,0,1,0);

105 DrawSegment(.5,1,.5,0);

106 DrawSegment(.5,1,1,1);

107 DrawSegment(1,0,1,1);

108 }
109 

110 void Draw7(Picture pict, int x, int y, int color,float scale)

111 {

112 localPict = pict; 

113 localScale = scale; 

114 localColor = color; 

115 localX = x; 

116 localY = y; 

117 DrawSegment(0,0,0,1); 

118 DrawSegment(0,1,1,1); 

119 } 
120 

121 void Draw8(Picture pict, int x, int y, int color,float scale)

122 { 

123 localPict = pict; 

124 localScale = scale; 

125 localColor = color; 

126 localX = x; 

127 localY = y; 

128 DrawSegment(0,0,0,1); 

129 DrawSegment(0,0,1,0); 

130 DrawSegment(1,0,1,1); 

131 DrawSegment(.5,1,.5,0); 

132 DrawSegment(0,1,1,1); 

133 } 
134 

135 void Draw9(Picture pict, int x, int y, int color,float scale)

136 { 

137 localPict = pict; 

138 localScale = scale; 

139 localColor = color; 

140 localX = x; 

141 localY = y; 

142 DrawSegment(0,0,0,1); 

143 DrawSegment(.5,0,.5,1); 

144 DrawSegment(0,0,.5,0); 

145 DrawSegment(0,1,1,1); 

146 } 
147 

148 typedef void DrFct(Picture pict, int x, int y, int color, float scale); 
149 

150 DrFct *DrawFunctions[] = {Draw0,Draw1,Draw2,Draw3,Draw4,Draw5,Draw6,

151 Draw7,Draw8,Draw9};
152 

153 void DrawNumeral(Picture pict, int x, int y, int color, float scale, int n) 

154 { 

155 (*DrawFunctions[n])(pict,x,y,color,scale);

156 }
157

158 void DrawNumber(Picture pict, int x, int y, int color, float scale, int n)

159 {

160 char s[100];

161 char *ptr;
162

163 sprintf(s,"%d",n);

164 ptr = s;

165 while (*ptr!='\0'){

166 DrawNumeral(pict,x,y,color,scale,*ptr-'0');

167 x += irint(scale*1.5);

168 ptr++; 

169 } 

170 } 
171 

172 #ifdef TRYMAIN 

173 main()

174 {

175 Picture pict;

176 pict = new_pict(400,200,1);

177 DrawNumber(pict,50,50,1,20,12345);

178 DrawNumber(pict,50,100,1,10,67890);

179 write_pict("junkfile.image",pict);

180 } 

181 #endif 
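
DrawNumber converts the number to a decimal string, maps each character to a digit with *ptr-'0', and advances x by 1.5 times the scale after every numeral. The sketch below isolates that layout step so it can be checked without a Picture. layout_digits is a hypothetical helper, not part of the listing, and it rounds with a plain cast instead of irint, which is an assumption that holds for non-negative scales.

```c
#include <assert.h>
#include <stdio.h>

/* Write the decimal digits of n into digits[] and the x position at which
 * each would be drawn into xs[]. Returns the digit count, or -1 if the
 * output arrays (capacity max) are too small. */
int layout_digits(int n, int x, float scale, int *digits, int *xs, int max)
{
    char s[32];
    int count = 0;

    assert(n >= 0);              /* a '-' sign has no numeral to draw */
    snprintf(s, sizeof s, "%d", n);
    for (const char *p = s; *p != '\0'; ++p) {
        if (count == max)
            return -1;
        digits[count] = *p - '0';              /* index into the digit table */
        xs[count] = x;
        x += (int)(scale * 1.5f + 0.5f);       /* advance, rounded to nearest */
        ++count;
    }
    return count;
}
```

The digit value is what DrawNumeral uses to index DrawFunctions, so a negative n would index outside the table; the sketch rejects it with an assertion.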
Jul 2 18:48 1991 orient.c



1 #include <stdio.h>

2 #include <values.h>

3 #include <math.h>

4 #include "misc.h"

5 #include "boolean.h"

6 #include "pict.h"

7 #include "orient.h"

8 #include "lines.h"
9

10

11 #define ABS(x) (((x)<0)?-(x):(x))
12 

13 extern long random(); 
14 

15 int RandomCoordinate(int maxValue) 

16 { 

17 return (float)(random()&0xffff)*maxValue/0xffff;

18 } 
19 

20 void RandomEdgePixel(Picture pict,int *x, int *y)

21 {

22 while (TRUE) {

23 *x = RandomCoordinate(pict->width);

24 *y = RandomCoordinate(pict->height);

25 if (ReadPixel(pict,*x,*y))

26 if (!(ReadPixel(pict,*x+1,*y) &&

27 ReadPixel(pict,*x-1,*y) &&

28 ReadPixel(pict,*x,*y+1) &&

29 ReadPixel(pict,*x,*y-1) &&

30 ReadPixel(pict,*x+1,*y+1) &&

31 ReadPixel(pict,*x-1,*y-1) &&

32 ReadPixel(pict,*x+1,*y-1) &&

33 ReadPixel(pict,*x-1,*y+1)))

34 return;

35 }
36

37 }
38 

39 /* #define SYMTHRESH 4 */ 

40 #define SYMTHRESH 0.17453278 

41 BOOLEAN FindBestMin(float *distances, int coarseDirections, float step,

42 float *orientation)

43 {

44 int i,j,minIndex,min2Index;

45 int orientationError;

46 float minValue,min2Value;

47 int maxBinError = irint(SYMTHRESH / step);
48

49 minIndex = 0;

50 minValue = distances[0];

51 for (i = 0;i < coarseDirections; ++i)

52 if (distances[i] < minValue) {

53 minValue = distances[i];

54 minIndex = i;

55 }

56 /* Now verify that there is another minima M_PI away */
57

58 min2Index = (minIndex+coarseDirections/4)%coarseDirections;

59 min2Value = distances[min2Index];

60 for (i = 0,j = min2Index;i < coarseDirections/2; ++i,j = (j+1)%coarseDirections)

61 if (distances[j] < min2Value) {

62 min2Value = distances[j];

63 min2Index = j;

64 }

65 orientationError = ABS((min2Index-minIndex)%coarseDirections) -

66 coarseDirections/2;

67 orientationError = ABS(orientationError);

68 if (orientationError < maxBinError) {

69 *orientation = minIndex*step;

70 return TRUE;

71 } else {

72 printf("Orientation error: %d %3.3f\n",orientationError,

73 orientationError*step/M_PI/2*360);

74 printf("%3.3f:%3.3f %3.3f:%3.3f\n",minIndex*step,minValue,

75 min2Index*step,min2Value);

76 return FALSE; 

77 } 

78 } 
79 

80 float Fine(Picture pict,int fineSamples, int fineDirections,

81 int coarseDirections, float coarseAngle, char *plotFile)

82 {

83 float coarseError;

84 int x,y;

85 float x2,y2;

86 int i,j;

87 float *counters;

88 float step,angle; 

89 float maxAngle; 

90 float maxValue; 

91 float maxLength; 

92 FILE *outfile; 
93 

94 counters = (float *)calloc(fineDirections,sizeof(float));

95 if (counters == NULL) {

96 printf("Fine: cannot allocate memory\n");

97 exit(-1);

98 }

99 /* coarseError = 2*(SYMTHRESH + 1)*2*M_PI/coarseDirections; */

100 coarseError = 2*SYMTHRESH;

101 step = coarseError/fineDirections;

102 printf("fine: +/- %3.3f\n",fineDirections/2*step);
103 

104 maxLength = sqrt((double)(pict->width*pict->width + 

105 pict->height*pict->height)); 

106 for (i = 0;i < fineSamples; ++i) {

107 RandomEdgePixel(pict,&x,&y);

108 angle = -fineDirections/2*step + coarseAngle;

109 for (j = 0;j < fineDirections; ++j,angle += step) {

110 x2 = x + maxLength*cos(angle);

111 y2 = y + maxLength*sin(angle);

112 counters[j] += CountLine(pict,x,y,(int)x2,(int)y2);

113 } 

114 } 
115 

116 angle = -fineDirections/2*step + coarseAngle;

117 maxAngle = angle;

118 maxValue = counters[0];

119 for (i = 0;i < fineDirections; ++i,angle += step) {

120 /* printf("%3.3f:%3.3f\n",angle,counters[i]); */

121 if (counters[i]>maxValue) {

122 maxAngle = angle;

123 maxValue = counters[i];

124 }

125 }
126 

127 /* Plot the orientation graph if requested */

128 angle = -fineDirections/2*step + coarseAngle;

129 if (angle < 0)

130 angle += 2*M_PI;

131 if (plotFile != NULL) {

132 printf("Opening fine orientation plot file\n");

133 if ((outfile = fopen(plotFile,"w")) == NULL) {

134 printf("Error opening fine orientation plot file\n");

135 exit(-1);

136 }

137 for (i = 0;i < fineDirections; ++i,angle += step)

138 fprintf(outfile,"%f %f\n",fmod(angle,2*M_PI),counters[i]);

139 fprintf(outfile,"\"Fine Distances\n\n");

140 fclose(outfile);

141 printf("Done writing fine orientation plot file\n");

142 } 
143 

144 

145 return maxAngle; 

146 } 
147 

148 float NewFine(Picture pict,int fineSamples, int fineDirections,

149 float angleStart,float angleEnd, char *plotFile)

150 {

151 int x,y;

152 float x2,y2;

153 int i,j;

154 float *counters;

155 float step,angle;

156 float maxAngle; 

157 float maxValue; 

158 float maxLength; 

159 FILE *outfile; 
160 

161 counters = (float *)calloc(fineDirections,sizeof(float)); 

162 if (counters == NULL) {

163 printf("Fine: cannot allocate memory\n");

164 exit(-1); 

165 } 
166 

167 step = ABS(angleEnd - angleStart)/fineDirections; 
168 

169 maxLength = sqrt((double)(pict->width*pict->width +

170 pict->height*pict->height));

171 for (i = 0;i < fineSamples; ++i) {

172 RandomEdgePixel(pict,&x,&y);

173 angle = angleStart;

174 for (j = 0;j < fineDirections; ++j) {

175 angle = fmod(angle,2*M_PI);

176 x2 = x + maxLength*cos(angle);

177 y2 = y + maxLength*sin(angle);

178 counters[j] += CountLine(pict,x,y,(int)x2,(int)y2);

179 angle += step;

180 } 

181 } 
182 

183 angle = angleStart; 

184 maxAngle = angle; 

185 maxValue = counters[0]; 

186 for (i = 0;i < fineDirections; ++i) {

187 angle = fmod(angle,2*M_PI);

188 if (counters[i] > maxValue) {

189 maxAngle = angle;

190 maxValue = counters[i];

191 }

192 angle += step;

193 }

194 printf("Orientation is at %f(%f)\n",maxAngle,maxAngle/2/M_PI*360);
195 

196 /* Plot the orientation graph if requested */ 

197 if (plotFile) {

198 printf("Opening fine orientation plot file\n");

199 if ((outfile = fopen(plotFile,"w")) == NULL) {

200 printf("Error opening fine orientation plot file\n");

201 exit(-1);

202 }

203 angle = angleStart;

204 for (i = 0;i < fineDirections; ++i) {

205 angle = fmod(angle,2*M_PI);

206 fprintf(outfile,"%f %f\n",angle,counters[i]);

207 angle += step;

208 }

209 fprintf(outfile,"\"Fine Distances\n\n");

210 fclose(outfile);

211 printf("Done writing fine orientation plot file\n");

212 } 

213 return maxAngle; 

214 } 
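
RandomEdgePixel accepts a randomly chosen pixel only when it is set and at least one of its eight neighbours is not. The sketch below restates that test on a plain row-major byte array. pixel_at and is_edge_pixel are hypothetical names, and treating everything outside the image as unset is an assumption, since the listing does not show how ReadPixel handles out-of-range coordinates.

```c
#include <assert.h>

/* Row-major lookup; coordinates outside the image read as unset (0).
 * This boundary rule is an assumption of the sketch. */
int pixel_at(const unsigned char *img, int w, int h, int x, int y)
{
    if (x < 0 || y < 0 || x >= w || y >= h)
        return 0;
    return img[y * w + x] != 0;
}

/* Return 1 when (x,y) is set and at least one 8-neighbour is unset. */
int is_edge_pixel(const unsigned char *img, int w, int h, int x, int y)
{
    assert(img != 0);
    if (!pixel_at(img, w, h, x, y))
        return 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            if ((dx != 0 || dy != 0) && !pixel_at(img, w, h, x + dx, y + dy))
                return 1;
    return 0;
}
```

In a solid 3x3 block only the centre pixel fails the test, which is why the sampling in Fine and NewFine concentrates rays on character boundaries.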
215 
1 #include <stdio.h>

2 #include <math.h>

3 #include "boolean.h"

4 #include "pict.h" 
5 

6 main(argc, argv) 

7 int argc; 

8 char *argv[]; 

9 { 

10 char *inFile1,*inFile2,*outFile;

11 Picture pict1,pict2,finalPict;

12 ColorMap cmap;

13 int x,y;
14 

15 if (argc != 4)

16 {

17 printf("\nUsage: %s infile1 infile2 outfile\n\n",

18 argv[0]);

19 exit(0);

20 } 
21 

22 inFile1 = argv[1]; /* get args */

23 inFile2 = argv[2];

24 outFile = argv[3]; 
25 

26 pict1 = load_pict(inFile1);

27 pict2 = load_pict(inFile2);

28 if ((pict1->depth != 1) || (pict2->depth != 1))

29 DoError("overlay: only depth 1 supported.\n",NULL);

30 if ((pict1->width != pict2->width)||(pict1->height != pict2->height))

31 DoError("overlay: images must be the same size\n",NULL);
32 

33 finalPict = new_pict(pict1->width,pict1->height,8);

34 cmap = NewColorMap(3);

35 WriteColorValue(cmap,0,0,0,0); /* Black */

36 WriteColorValue(cmap,1,0,128,0); /* Olive */

37 WriteColorValue(cmap,2,0,255,0); /* Green */

38 finalPict->cmap = cmap;
39 

40 for (y = 0;y < pict1->height; ++y)

41 for (x = 0;x < pict1->width; ++x)

42 if (ReadPixel(pict1,x,y))

43 WritePixel(finalPict,x,y,2);

44 else if (ReadPixel(pict2,x,y))

45 WritePixel(finalPict,x,y,1);
46

47 write_pict(outFile,finalPict);

48 } 
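
The overlay program above combines two one-bit images into one colour-mapped image: a pixel set in the first image takes colour index 2, a pixel set only in the second takes index 1, and everything else stays at index 0. A minimal sketch of that priority rule on flat byte arrays follows. The function name and the array representation are assumptions made for illustration; the listing works on Picture objects through ReadPixel and WritePixel.

```c
#include <assert.h>
#include <stddef.h>

/* Combine two binary masks of n pixels into colour indices:
 * 2 where a is set, 1 where only b is set, 0 elsewhere. */
void overlay(const unsigned char *a, const unsigned char *b,
             unsigned char *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (a[i])
            out[i] = 2;
        else if (b[i])
            out[i] = 1;
        else
            out[i] = 0;
    }
}
```

Unlike the listing, the sketch writes index 0 explicitly rather than relying on a freshly allocated picture being zeroed.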
Jul 1 13:45 1991 pagestats.c

1 #include <stdio.h> 

2 #include <math.h> 

3 #include "boolean.h" 

4 #include "types.h"

5 #include "error.h"

6 #include "pict.h"

7 #include "dict.h"
8

9 #define UP 0

10 #define DOWN 1

11 typedef int Direction;
12 

13 extern Picture thePict; 
14 

15 void StoreRawOutlinePair(Dictionary dict, int dictEntry,

16 Box box,int *bothX,int *topY,int *baseY,

17 int numberOfLegs)

18 {

19 RawOutlinePair temp;

20 int i;

21 int *xCursor,*topCursor,*bottomCursor;
22 

23 temp = (RawOutlinePair)calloc(1,sizeof(RawOutlinePairBody)); 

24 if (temp == NULL) 

25 DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);
26

27 temp->box = box;

28 temp->numberOfLegs = numberOfLegs;
29

30 temp->x = (int *)calloc(temp->numberOfLegs,sizeof(int));

31 temp->top = (int *)calloc(temp->numberOfLegs,sizeof(int));

32 temp->bottom = (int *)calloc(temp->numberOfLegs,sizeof(int));

33 if ((temp->x == NULL) ||

34 (temp->top == NULL) ||

35 (temp->bottom == NULL))

36 DoError("StoreRawOutlinePair: cannot allocate space\n",NULL); 
37 

38 xCursor = temp->x; 

39 topCursor = temp->top; 

40 bottomCursor = temp->bottom;
41

42 for (i = 0;i < numberOfLegs; ++i) {

43 *xCursor++ = *bothX++;

44 *topCursor++ = *topY++;

45 *bottomCursor++ = *baseY++;

46 } 

47 *(dict->rawOutlines+dictEntry) = temp; 

48 } 
49 

50 void StoreOutlinePair(Dictionary dict, int dictEntry,

51 int middleLine,int fontXHeight) 

52 { 
53 RawOutlinePair raw;

54 OutlinePair temp;

55 int i,numberOfLegs;

56 int y;

57 int offset;

58 int *xSCursor,*topSCursor,*bottomSCursor;

59 float *xDCursor,*topDCursor,*bottomDCursor;

60 float *xCursor,*topCursor,*bottomCursor;

61 int left,right;

62 float foffset; 
63 

64 raw = *(dict->rawOutlines+dictEntry); 
65 

66 temp = (OutlinePair)calloc(1,sizeof(OutlinePairBody));

67 if (temp == NULL)

68 DoError("StoreOutlinePair: cannot allocate space\n",NULL);
69

70 temp->x = (float *)calloc(raw->numberOfLegs,sizeof(float));

71 temp->top = (float *)calloc(raw->numberOfLegs,sizeof(float));

72 temp->bottom = (float *)calloc(raw->numberOfLegs,sizeof(float));

73 if ((temp->x == NULL) ||

74 (temp->top == NULL) ||

75 (temp->bottom == NULL))

76 DoError("StoreOutlinePair: cannot allocate space\n",NULL);
77 

78 temp->box = raw->box;

79 temp->blackoutHeight = 0;

80 temp->numberOfLegs = raw->numberOfLegs;

81 offset = temp->offset = *(raw->x);

82 temp->width = *(raw->x+raw->numberOfLegs-1) - temp->offset;
83

84 xDCursor = temp->x;

85 topDCursor = temp->top;

86 bottomDCursor = temp->bottom;

87 xSCursor = raw->x;

88 topSCursor = raw->top;

89 bottomSCursor = raw->bottom;
90 

91 numberOfLegs = raw->numberOfLegs; 

92 for (i = 0;i < numberOfLegs; ++i) {

93 *xDCursor++ = (float)(*xSCursor++ - offset)/fontXHeight;

94 y = middleLine - *topSCursor++;

95 if (y<0)

96 y = 0;

97 *topDCursor++ = (float)y / fontXHeight;

98 y = *bottomSCursor++ - middleLine;

99 if (y<0)

100 y = 0;

101 *bottomDCursor++ = (float)y / fontXHeight;

102 } 
103 

104 /* Now try to remove parts of the contour on the left and right of the

105 * word shape that are at height 0 */
106

107 topDCursor = temp->top;

108 bottomDCursor = temp->bottom;

109 for (i = 0;i < numberOfLegs; ++i) {

110 if ((*topDCursor++ != 0)||(*bottomDCursor++ != 0))

111 break;

112 }

113 left = i;
114

115 topDCursor = temp->top+numberOfLegs-1;

116 bottomDCursor = temp->bottom+numberOfLegs-1;

117 for (i = numberOfLegs-1;i >= 0;--i) {

118 if ((*topDCursor-- != 0)||(*bottomDCursor-- != 0))

119 break;

120 }

121 right = 
122 

123 xDCursor = temp->x; 

124 topDCursor = temp->top;

125 bottomDCursor = temp->bottom;

126 xCursor = temp->x+left;

127 topCursor = temp->top+left;

128 bottomCursor = temp->bottom+left;

129 foffset = *xSCursor;

130 for (i = left;i < right; ++i) {

131 *xDCursor++ = *xCursor++ - foffset;

132 *topDCursor++ = *topCursor++;

133 *bottomDCursor++ = *bottomCursor++;

134 }

135 temp->numberOfLegs = right-left;
136

137 *(dict->outlines+dictEntry) = temp;

138 } 
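
The first loop of StoreOutlinePair turns raw pixel coordinates into x-height units: x is measured from the first leg, the top contour is measured upward from the middle line, the bottom contour downward from it, and negative heights are clipped to zero. A minimal sketch of that arithmetic on plain arrays is shown below. normalize_contour and its parameter names are hypothetical, and the trimming of zero-height legs that follows in the listing is deliberately left out.

```c
#include <assert.h>

/* Normalize n contour samples to x-height units.
 * nx, ntop and nbottom must each have room for n floats. */
void normalize_contour(const int *x, const int *top, const int *bottom,
                       int n, int middleLine, int xHeight,
                       float *nx, float *ntop, float *nbottom)
{
    assert(n > 0 && xHeight > 0);
    int offset = x[0];                      /* leftmost leg becomes x = 0 */
    for (int i = 0; i < n; ++i) {
        int up = middleLine - top[i];       /* height above the middle line */
        int down = bottom[i] - middleLine;  /* depth below the middle line */
        if (up < 0)
            up = 0;
        if (down < 0)
            down = 0;
        nx[i] = (float)(x[i] - offset) / xHeight;
        ntop[i] = (float)up / xHeight;
        nbottom[i] = (float)down / xHeight;
    }
}
```

Dividing by the x-height makes contours from different font sizes comparable, which is presumably why the listing names the surrounding options noXHeightNorm and noAscenderNorm.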
139 

140 static int lineSpacing; 

141 int OrderOutlinePair(OutlinePair *o1,OutlinePair *o2)

142 {

143 int yDistance;

144 int xDistance;

145 yDistance = (*o1)->box->pageY- (*o2)->box->pageY; 

146 if (yDistance < lineSpacing && yDistance > -lineSpacing) { 

147 xDistance = (*o1)->box->pageX- (*o2)->box->pageX; 

148 return xDistance; 

149 } 

150 return yDistance; 

151 } 
152 

153 void SortDictionary(Dictionary dict)

154 {

155 lineSpacing = 20;

156 qsort(dict->rawOutlines,dict->numberOfEntries,sizeof(RawOutlinePair),

157 OrderOutlinePair);

158 }
159

160 #define HIST_SIZE 100

161 void Histogram(int *data, int dataLength, int offset, int *histogram) 

162 { 
163 int i,bin;

164 for (i = 0;i < dataLength; ++i) {

165 bin = *data-offset;

166 if ((bin >= 0)&&(bin < HIST_SIZE))

167 histogram[bin]++;

168 data++;

169 } 

170 } 
171 

172 void HistogramPeaks(int *data,int dataLength, int offset, int *histogram)

173 {

174 int i,bin;

175 Direction direction;
176

177 if (*(data+1) < *data)

178 direction = UP;

179 else {

180 bin = *data-offset;

181 if ((bin >= 0)&&(bin < HIST_SIZE))

182 histogram[bin]++;

183 direction = DOWN;

184 }

185 ++data;
186

187 for (i = 1;i < dataLength-1; ++i) {

188 if ((direction == UP) &&

189 (*data < *(data+1))) {

190 /* *data is a peak */

191 bin = *data-offset;

192 if ((bin >= 0)&&(bin < HIST_SIZE))

193 histogram[bin]++;

194 direction = DOWN;

195 }

196 else if ((direction == DOWN) &&

197 (*data > *(data+1))) {

198 /* *data is a valley */

199 direction = UP;

200 }

201 ++data;

202 } /* for i */

203 } 
204 

205 void HistogramValleys(int *data,int dataLength,int offset,int *histogram)

206 {

207 int i,bin;

208 Direction direction;
209

210 if (*(data+1) > *data)

211 direction = UP;

212 else {

213 bin = *data-offset;

214 if ((bin >= 0)&&(bin < HIST_SIZE))

215 histogram[bin]++;

216 direction = DOWN;

217 }

218 ++data;
219

220 for (i = 1;i < dataLength-1; ++i) {

221 if ((direction == UP) &&

222 (*data > *(data+1))) {

223 /* *data is a peak */

224 bin = *data-offset;

225 if ((bin >= 0)&&(bin < HIST_SIZE))

226 histogram[bin]++;

227 direction = DOWN;

228 }

229 else if ((direction == DOWN) &&

230 (*data < *(data+1))) {

231 /* *data is a valley */

232 direction = UP;

233 }

234 ++data;

235 } /* for i */

236 } 
237 

238 int MaxBin(int *histogram) 

239 { 

240 int i;

241 int maxValue;

242 int maxIndex;
243

244 maxValue = histogram[0];

245 maxIndex = 0;

246 for (i = 0;i < HIST_SIZE; ++i)

247 if (histogram[i]>maxValue) { 

248 maxValue = histogram[i]; 

249 maxlndex = i; 

250 } 

251 return maxlndex; 

252 } 
253 

254 void PostProcess(Dictionary dict)

255 {

256 int index;

257 int temp;

258 int i,startIndex,firstY,minY,endIndex,shape;

259 int tops[HIST_SIZE];

260 int bottoms[HIST_SIZE];

261 int middleLine,topLine,bottomLine;

262 int fontXHeight;

263 RawOutlinePair thisShape;
264

265 SortDictionary(dict);
266 

267 index = 0; 

268 #ifdef foo

269 malloc_verify();

270 #endif

271 while (index < dict->numberOfEntries) {

272 startIndex = index;

273 firstY = (*(dict->rawOutlines+index))->box->pageY;

274 minY = firstY;

275 while ((*(dict->rawOutlines+index))->box->pageY-firstY < 20 &&

276 (*(dict->rawOutlines+index))->box->pageY-firstY > -20) {

277 if (minY > ((*(dict->rawOutlines+index))->box->pageY))

278 minY = (*(dict->rawOutlines+index))->box->pageY;

279 ++index;

280 if (index == dict->numberOfEntries)

281 break;

282 }

283 endIndex = index;
284 

285 #ifdef foo

286 malloc_verify();

287 #endif
288

289 /* shapes from startIndex through endIndex are all on */

290 /* the same text line */

291 /* minY has the top of the highest box on the line. */
292 

293 /* Find the base and toplines by taking the mode of the heights of the 

294 * valleys of the bottom contours and the peaks of the top contours */ 

295 for (i = 0;i < HIST_SIZE; ++i) {

296 tops[i] = 0; 

297 bottoms[i] = 0; 

298 } 

299 for (shape = startIndex;shape < endIndex; ++shape) {

300 thisShape = *(dict->rawOutlines+shape);

301 Histogram(thisShape->top,thisShape->numberOfLegs,minY,tops);

302 Histogram(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);
303

304 #ifdef foo

305 HistogramPeaks(thisShape->top,thisShape->numberOfLegs,minY,tops);

306 HistogramValleys(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);

307 #endif

308 }

309 topLine = MaxBin(tops) + minY;

310 bottomLine = MaxBin(bottoms) + minY; 
311 

312 if (thePict) { 

313 int maxLength; 

314 inthalfWidth; 

315 intx ( y; 

316 float x2,x3,y2,y3; 

317 float angle; 
318 

319 angle = (*(dict->rawOutlines))->box->angle; 

320 maxLength = thePict-> width + thePict-> height; 

321 harfWidth = thePict-> width / 2; 

322 x = topLine * -sin(angle) + halfWidth * cos(angle); 

323 y = topLine * cos(angle) + halfWidth * sin(angle); 

324 x2 = x+maxLength*cos(angle); 

325 y2 = y + maxlength*sin(angle); 

326 x3 = x-maxLength*cos(angte); 

327 y3 = y-maxLength*sin(angle); 
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328 DrawLine(thePict,x,y.(int)x2,(int)y2,5); 

329 DrawLine(thePictxy.(int)x3,(int)y3,5); 
330 

331 x = bottomLine * -sin(angle) + halfWidth * cos(angle); 

332 y = bottomLine * cos(angle) + halfWidth * sin(angle); 

333 x2 = x+maxLength*cos(angle); 

334 y2 = y + maxlength*sin(ang!e); 

335 x3 = x-maxLength*cos(angle); 

336 y3 = y-maxLength*$in(angle); 

337 DrawLine(thePict,x f y,<int)x2.(int)y2,5); 

338 DrawLinetthePict^^intJxSXintJyS.S); 
339 

340 } 
341 

342 #ifdeffoo 

343 malioc_verifyO; 

344 #endif 
345 

346 middleLine = (bottomLine-f topLine)/2; 

347 fontXHeight = bottomLine-topLine; 

348 /* Clip and normalize the contours */ 

349 for (shape=startlndex;shape<endlndex; + +shape) 

350 StoreOutlinePairtdict^hape^iddleLincfontXHeight); 

351 } /* Do another line of text */ 

352 } 
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Jul t 13:46 1991 postproc.c 



#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "types.h"
#include "error.h"
#include "pict.h"
#include "dict.h"

#define UP 0
#define DOWN 1
typedef int Direction;

extern Picture thePict;

void StoreRawOutlinePair(Dictionary dict, int dictEntry,
                         Box box,int *bothX,int *topY, int *baseY,
                         int numberOfLegs)
{
    RawOutlinePair temp;
    int i;
    int *xCursor,*topCursor,*bottomCursor;

    temp = (RawOutlinePair)calloc(1,sizeof(RawOutlinePairBody));
    if (temp == NULL)
        DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

    temp->box = box;
    temp->numberOfLegs = numberOfLegs;

    temp->x = (int *)calloc(temp->numberOfLegs,sizeof(int));
    temp->top = (int *)calloc(temp->numberOfLegs,sizeof(int));
    temp->bottom = (int *)calloc(temp->numberOfLegs,sizeof(int));
    if ((temp->x == NULL) ||
        (temp->top == NULL) ||
        (temp->bottom == NULL))
        DoError("StoreRawOutlinePair: cannot allocate space\n",NULL);

    xCursor = temp->x;
    topCursor = temp->top;
    bottomCursor = temp->bottom;

    for (i = 0; i < numberOfLegs; ++i) {
        *xCursor++ = *bothX++;
        *topCursor++ = *topY++;
        *bottomCursor++ = *baseY++;
    }
    *(dict->rawOutlines+dictEntry) = temp;
}

void StoreOutlinePair(Dictionary dict,int dictEntry,
                      int middleLine,int fontXHeight)
{
    RawOutlinePair raw;
    OutlinePair temp;
    int i,numberOfLegs;
    int y;
    int offset;
    int *xSCursor,*topSCursor,*bottomSCursor;
    float *xDCursor,*topDCursor,*bottomDCursor;
    float *xCursor,*topCursor,*bottomCursor;
    int left,right;
    float foffset;

    raw = *(dict->rawOutlines+dictEntry);

    temp = (OutlinePair)calloc(1,sizeof(OutlinePairBody));
    if (temp == NULL)
        DoError("StoreOutlinePair: cannot allocate space\n",NULL);

    temp->x = (float *)calloc(raw->numberOfLegs,sizeof(float));
    temp->top = (float *)calloc(raw->numberOfLegs,sizeof(float));
    temp->bottom = (float *)calloc(raw->numberOfLegs,sizeof(float));
    if ((temp->x == NULL) ||
        (temp->top == NULL) ||
        (temp->bottom == NULL))
        DoError("StoreOutlinePair: cannot allocate space\n",NULL);

    temp->box = raw->box;
    temp->blackoutHeight = 0;
    temp->numberOfLegs = raw->numberOfLegs;
    offset = temp->offset = *(raw->x);
    temp->width = *(raw->x+raw->numberOfLegs-1) - temp->offset;

    xDCursor = temp->x;
    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    xSCursor = raw->x;
    topSCursor = raw->top;
    bottomSCursor = raw->bottom;

    numberOfLegs = raw->numberOfLegs;
    for (i = 0; i < numberOfLegs; ++i) {
        *xDCursor++ = (float)(*xSCursor++ - offset)/fontXHeight;
        y = middleLine - *topSCursor++;
        if (y < 0)
            y = 0;
        *topDCursor++ = (float)y / fontXHeight;
        y = *bottomSCursor++ - middleLine;
        if (y < 0)
            y = 0;
        *bottomDCursor++ = (float)y / fontXHeight;
    }

    /* Now try to remove parts of the contour on to the left and right of the
     * word shape that are at height 0 */

    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    for (i = 0; i < numberOfLegs; ++i) {
        if ((*topDCursor++ != 0)||(*bottomDCursor++ != 0))
            break;
    }
    left = i;

    topDCursor = temp->top+numberOfLegs-1;
    bottomDCursor = temp->bottom+numberOfLegs-1;
    for (i = numberOfLegs-1; i >= 0; --i) {
        if ((*topDCursor-- != 0)||(*bottomDCursor-- != 0))
            break;
    }
    right = i+1;

    xDCursor = temp->x;
    topDCursor = temp->top;
    bottomDCursor = temp->bottom;
    xCursor = temp->x+left;
    topCursor = temp->top+left;
    bottomCursor = temp->bottom+left;
    foffset = *xSCursor;
    for (i = left; i < right; ++i) {
        *xDCursor++ = *xCursor++ - foffset;
        *topDCursor++ = *topCursor++;
        *bottomDCursor++ = *bottomCursor++;
    }
    temp->numberOfLegs = right-left;

    *(dict->outlines+dictEntry) = temp;
}
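The clip-and-normalize step performed by StoreOutlinePair can be illustrated in isolation. The following is a minimal sketch and not part of the listing: normalize_contour and trim_zero_legs are hypothetical helper names, and the sketch assumes page y coordinates that grow downward and a positive x-height.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the listing): normalize one contour.
 * raw[i] is a page y coordinate. sign > 0 treats raw as a top contour
 * (distance measured upward from middleLine); otherwise raw is treated
 * as a bottom contour (distance measured downward from middleLine). */
void normalize_contour(const int *raw, size_t n, int middleLine,
                       int xHeight, int sign, float *out)
{
    size_t i;
    assert(xHeight > 0);
    for (i = 0; i < n; ++i) {
        int y = (sign > 0) ? middleLine - raw[i] : raw[i] - middleLine;
        if (y < 0)
            y = 0;  /* clip samples on the wrong side of the middle line */
        out[i] = (float)y / (float)xHeight;
    }
}

/* Hypothetical helper: find the half-open range [left, right) that is
 * left after dropping leading and trailing samples where both
 * normalized contours are zero. */
void trim_zero_legs(const float *top, const float *bottom, size_t n,
                    size_t *left, size_t *right)
{
    size_t l = 0, r = n;
    while (l < n && top[l] == 0.0f && bottom[l] == 0.0f)
        ++l;
    while (r > l && top[r - 1] == 0.0f && bottom[r - 1] == 0.0f)
        --r;
    *left = l;
    *right = r;
}
```

Dividing by the x-height is what makes shapes from different font sizes comparable. The sketch leaves out the x-coordinate shifting that the listing also performs.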

static int lineSpacing;
int OrderOutlinePair(OutlinePair *o1,OutlinePair *o2)
{
    int yDistance;
    int xDistance;
    yDistance = (*o1)->box->pageY - (*o2)->box->pageY;
    if (yDistance < lineSpacing && yDistance > -lineSpacing) {
        xDistance = (*o1)->box->pageX - (*o2)->box->pageX;
        return xDistance;
    }
    return yDistance;
}

void SortDictionary(Dictionary dict)
{
    lineSpacing = 20;
    qsort(dict->rawOutlines,dict->numberOfEntries,sizeof(RawOutlinePair),
          OrderOutlinePair);
}
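The reading-order sort above can be sketched as a self-contained example. This is an illustration under stated assumptions, not the listing's code: WordBox is a hypothetical stand-in for the listing's box type, and the comparator uses the const void * signature that standard C qsort expects.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the page position of a word's bounding box. */
typedef struct {
    int pageX;
    int pageY;
} WordBox;

static int line_spacing = 20;   /* same tolerance value as the listing */

/* Boxes whose y coordinates differ by less than line_spacing are treated
 * as lying on the same text line and are ordered left to right;
 * otherwise the box that is higher on the page comes first. */
static int compare_reading_order(const void *a, const void *b)
{
    const WordBox *p = *(const WordBox *const *)a;
    const WordBox *q = *(const WordBox *const *)b;
    int dy = p->pageY - q->pageY;
    if (dy < line_spacing && dy > -line_spacing)
        return p->pageX - q->pageX;
    return dy;
}

/* Sorts an array of pointers to boxes into approximate reading order. */
void sort_reading_order(WordBox **boxes, size_t count)
{
    assert(boxes != NULL || count == 0);
    qsort(boxes, count, sizeof *boxes, compare_reading_order);
}
```

A tolerance-band comparison of this kind is not guaranteed to be a consistent ordering when line positions drift gradually down the page, so the sketch is only appropriate where text lines are separated by clearly more than the tolerance.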

#define HIST_SIZE 100
void HistogramMax(int *data,int dataLength,int offset,int sign,int *histogram)
{
    int i,bin;

    if (sign > 0) {
        int maxValue;
        maxValue = *data;
        for (i = 0; i < dataLength; ++i)
            if (data[i] > maxValue)
                maxValue = data[i];
        bin = maxValue-offset;
        if ((bin >= 0)&&(bin < HIST_SIZE))
            histogram[bin]++;
    }
    else {
        int minValue;
        minValue = *data;
        for (i = 0; i < dataLength; ++i)
            if (data[i] < minValue)
                minValue = data[i];
        bin = minValue-offset;
        if ((bin >= 0)&&(bin < HIST_SIZE))
            histogram[bin]++;
    }
}

void Histogram(int *data,int dataLength, int offset, int *histogram)
{
    int i,bin;

    for (i = 0; i < dataLength; ++i) {
        bin = *data-offset;
        if ((bin >= 0)&&(bin < HIST_SIZE))
            histogram[bin]++;
        data++;
    }
}

void HistogramPeaks(int *data,int dataLength, int offset, int *histogram)
{
    int i,bin;
    Direction direction;

    if (*(data+1) < *data)
        direction = UP;
    else {
        bin = *data-offset;
        if ((bin >= 0)&&(bin < HIST_SIZE))
            histogram[bin]++;
        direction = DOWN;
    }
    ++data;

    for (i = 1; i < dataLength-1; ++i) {
        if ((direction == UP) &&
            (*data < *(data+1))) {
            /* *data is a peak */
            bin = *data-offset;
            if ((bin >= 0)&&(bin < HIST_SIZE))
                histogram[bin]++;
            direction = DOWN;
        }
        else if ((direction == DOWN) &&
                 (*data > *(data+1))) {
            /* *data is a valley */
            direction = UP;
        }
        ++data;
    } /* for i */
}

void HistogramValleys(int *data,int dataLength, int offset, int *histogram)
{
    int i,bin;
    Direction direction;

    if (*(data+1) > *data)
        direction = UP;
    else {
        bin = *data-offset;
        if ((bin >= 0)&&(bin < HIST_SIZE))
            histogram[bin]++;
        direction = DOWN;
    }
    ++data;

    for (i = 1; i < dataLength-1; ++i) {
        if ((direction == UP) &&
            (*data > *(data+1))) {
            /* *data is a peak */
            bin = *data-offset;
            if ((bin >= 0)&&(bin < HIST_SIZE))
                histogram[bin]++;
            direction = DOWN;
        }
        else if ((direction == DOWN) &&
                 (*data < *(data+1))) {
            /* *data is a valley */
            direction = UP;
        }
        ++data;
    } /* for i */
}
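The idea behind the two routines above, counting the heights at which a contour turns around, can be sketched more simply. The version below is a simplified stand-in and not the listing's state machine: histogram_local_minima and SKETCH_HIST_SIZE are hypothetical names, only strict interior minima are detected, and endpoints and plateaus are ignored. With page y coordinates that grow downward, a local minimum of y is a visual peak of the contour.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HIST_SIZE 100    /* hypothetical bin count */

/* Adds one count to histogram[data[i] - offset] for every strict
 * interior local minimum of data[]. Returns the number of minima that
 * fell inside the histogram range. */
int histogram_local_minima(const int *data, size_t n, int offset,
                           int *histogram)
{
    size_t i;
    int found = 0;
    assert(data != NULL && histogram != NULL);
    for (i = 1; i + 1 < n; ++i) {
        if (data[i] < data[i - 1] && data[i] < data[i + 1]) {
            int bin = data[i] - offset;
            if (bin >= 0 && bin < SKETCH_HIST_SIZE) {
                histogram[bin]++;
                ++found;
            }
        }
    }
    return found;
}
```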

int MaxBin(int *histogram)
{
    int i;
    int maxValue;
    int maxIndex;

    maxValue = histogram[0];
    maxIndex = 0;
    for (i = 0; i < HIST_SIZE; ++i)
        if (histogram[i] > maxValue) {
            maxValue = histogram[i];
            maxIndex = i;
        }
    return maxIndex;
}
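Histogram and MaxBin are used together to take the mode of a set of contour heights, which serves as an estimate of a text line's baseline or top line. The sketch below combines the two steps in one function for illustration. mode_line and SKETCH_HIST_SIZE are hypothetical names, and the tie-breaking rule (lowest bin wins) is an assumption that matches a strict greater-than comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HIST_SIZE 100    /* hypothetical bin count */

/* Returns offset plus the most frequent value of (data[i] - offset).
 * Samples that fall outside the histogram are ignored, and ties go to
 * the lowest bin. */
int mode_line(const int *data, size_t n, int offset)
{
    int histogram[SKETCH_HIST_SIZE];
    int best = 0;
    int b;
    size_t i;

    assert(data != NULL || n == 0);
    memset(histogram, 0, sizeof histogram);
    for (i = 0; i < n; ++i) {
        int bin = data[i] - offset;
        if (bin >= 0 && bin < SKETCH_HIST_SIZE)
            histogram[bin]++;
    }
    for (b = 1; b < SKETCH_HIST_SIZE; ++b)
        if (histogram[b] > histogram[best])
            best = b;
    return best + offset;
}
```

Given a baseline and a top line estimated this way, the x-height is their difference and the middle line is their average, as PostProcess computes.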

void PostProcess(Dictionary dict)
{
    int index;
    int temp;
    int i,startIndex,firstY,minY,endIndex,shape;
    int tops[HIST_SIZE];
    int bottoms[HIST_SIZE];
    int middleLine,topLine,bottomLine;
    int fontXHeight;
    RawOutlinePair thisShape;

    SortDictionary(dict);

    index = 0;
#ifdef foo
    malloc_verify();
#endif
    while (index < dict->numberOfEntries) {
        startIndex = index;
        firstY = (*(dict->rawOutlines+index))->box->pageY;
        minY = firstY;
        while ((*(dict->rawOutlines+index))->box->pageY-firstY < 20 &&
               (*(dict->rawOutlines+index))->box->pageY-firstY > -20) {
            if (minY > ((*(dict->rawOutlines+index))->box->pageY))
                minY = (*(dict->rawOutlines+index))->box->pageY;
            ++index;
            if (index == dict->numberOfEntries)
                break;
        }
        endIndex = index;

#ifdef foo
        malloc_verify();
#endif

        /* shapes from start index through endindex are all on */
        /* the same text line */
        /* minY has the top of the highest box on the line. */

        /* Find the base and toplines by taking the mode of the heights of the
         * valleys of the bottom contours and the peaks of the top contours */
        for (i = 0; i < HIST_SIZE; i++) {
            tops[i] = 0;
            bottoms[i] = 0;
        }
        for (shape = startIndex; shape < endIndex; ++shape) {
            thisShape = *(dict->rawOutlines+shape);
            Histogram(thisShape->top,thisShape->numberOfLegs,minY,tops);
            Histogram(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);

#ifdef foo
            HistogramPeaks(thisShape->top,thisShape->numberOfLegs,minY,tops);
            HistogramValleys(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);
#endif
        }
        topLine = MaxBin(tops) + minY;
        bottomLine = MaxBin(bottoms) + minY;

        if (thePict) {
            int maxLength;
            int halfWidth;
            int x,y;
            float x2,x3,y2,y3;
            float angle;

            angle = (*(dict->rawOutlines))->box->angle;
            maxLength = thePict->width + thePict->height;
            halfWidth = thePict->width / 2;
            x = topLine * -sin(angle) + halfWidth * cos(angle);
            y = topLine * cos(angle) + halfWidth * sin(angle);
            x2 = x+maxLength*cos(angle);
            y2 = y+maxLength*sin(angle);
            x3 = x-maxLength*cos(angle);
            y3 = y-maxLength*sin(angle);
            DrawLine(thePict,x,y,(int)x2,(int)y2,5);
            DrawLine(thePict,x,y,(int)x3,(int)y3,5);

            x = bottomLine * -sin(angle) + halfWidth * cos(angle);
            y = bottomLine * cos(angle) + halfWidth * sin(angle);
            x2 = x+maxLength*cos(angle);
            y2 = y+maxLength*sin(angle);
            x3 = x-maxLength*cos(angle);
            y3 = y-maxLength*sin(angle);
            DrawLine(thePict,x,y,(int)x2,(int)y2,5);
            DrawLine(thePict,x,y,(int)x3,(int)y3,5);

        }

#ifdef foo
        malloc_verify();
#endif

        middleLine = (bottomLine+topLine)/2;
        fontXHeight = bottomLine-topLine;
        /* Clip and normalize the contours */
        for (shape = startIndex; shape < endIndex; ++shape)
            StoreOutlinePair(dict,shape,middleLine,fontXHeight);
    } /* Do another line of text */
}

void PageStatistics(Dictionary dict,char *fileName)
/* WARNING - this must be run before PostProcess since PostProcess changes the raw
 * shape data. */
{
    int index;
    int temp;
    int i,startIndex,firstY,minY,endIndex,shape;
    int tops[HIST_SIZE];
    int bottoms[HIST_SIZE];
    int ascenders[HIST_SIZE];
    int descenders[HIST_SIZE];
    int middleLine,topLine,bottomLine,ascenderLine,descenderLine;
    int ascenderHeight,descenderHeight,lineNumber;
    int fontXHeight;
    RawOutlinePair thisShape;
    FILE *fp;

    if ((fp = fopen(fileName,"w")) == NULL)
        DoError("PageStatistics: error opening output file %s.\n",fileName);

    SortDictionary(dict);

    index = 0;
#ifdef foo
    malloc_verify();
#endif
    lineNumber = 0;
    while (index < dict->numberOfEntries) {
        startIndex = index;
        firstY = (*(dict->rawOutlines+index))->box->pageY;
        minY = firstY;
        while ((*(dict->rawOutlines+index))->box->pageY-firstY < 20 &&
               (*(dict->rawOutlines+index))->box->pageY-firstY > -20) {
            if (minY > ((*(dict->rawOutlines+index))->box->pageY))
                minY = (*(dict->rawOutlines+index))->box->pageY;
            ++index;
            if (index == dict->numberOfEntries)
                break;
        }
        endIndex = index;

#ifdef foo
        malloc_verify();
#endif

        /* shapes from start index through endindex are all on */
        /* the same text line */
        /* minY has the top of the highest box on the line. */

        /* Find the base and toplines by taking the mode of the heights of the
         * valleys of the bottom contours and the peaks of the top contours */
        for (i = 0; i < HIST_SIZE; i++) {
            tops[i] = 0;
            bottoms[i] = 0;
            ascenders[i] = 0;
            descenders[i] = 0;
        }
        for (shape = startIndex; shape < endIndex; ++shape) {
            thisShape = *(dict->rawOutlines+shape);
            Histogram(thisShape->top,thisShape->numberOfLegs,minY,tops);
            Histogram(thisShape->bottom,thisShape->numberOfLegs,minY,bottoms);

            HistogramMax(thisShape->top,thisShape->numberOfLegs,minY,1,ascenders);
            HistogramMax(thisShape->bottom,thisShape->numberOfLegs,minY,1,descenders);
        }
        topLine = MaxBin(tops) + minY;
        bottomLine = MaxBin(bottoms) + minY;
        ascenderLine = MaxBin(ascenders) + minY;
        descenderLine = MaxBin(descenders) + minY;

#ifdef foo
        malloc_verify();
#endif

        middleLine = (bottomLine+topLine)/2;
        fontXHeight = bottomLine-topLine;

        ascenderHeight = bottomLine-ascenderLine;
        descenderHeight = descenderLine-bottomLine;
        fprintf(fp,"%d: %d %d %d %2.6f\n",lineNumber,fontXHeight,ascenderHeight,descenderHeight,
                (float)ascenderHeight/(float)fontXHeight);
        ++lineNumber;
    } /* Do another line of text */
    fclose(fp);
}

Jul 10 13:17 1991 testFine.c



#include <stdio.h>
#include <math.h>
#include "boolean.h"
#include "pict.h"
#include "lines.h"

#define ABS(x) (((x)<0)?(-(x)):(x))

extern long random();

int RandomCoordinate(int maxValue)
{
    return (float)(random()&0xffff)*maxValue/0xffff;
}

void RandomEdgePixel(Picture pict,int *x, int *y)
{
    while (TRUE) {
        *x = RandomCoordinate(pict->width);
        *y = RandomCoordinate(pict->height);
        if (ReadPixel(pict,*x,*y))
            if (!(ReadPixel(pict,*x+1,*y) &&
                  ReadPixel(pict,*x-1,*y) &&
                  ReadPixel(pict,*x,*y+1) &&
                  ReadPixel(pict,*x,*y-1) &&
                  ReadPixel(pict,*x+1,*y+1) &&
                  ReadPixel(pict,*x-1,*y-1) &&
                  ReadPixel(pict,*x+1,*y-1) &&
                  ReadPixel(pict,*x-1,*y+1)))
                return;
    }

}

float Fine(Picture pict,int fineSamples, int fineDirections,
           float angleStart,float angleEnd, char *plotFile)
{
    int x,y;
    float x2,y2;
    int i,j;
    float *counters;
    float step,angle;
    float maxAngle;
    float maxValue;
    float maxLength;
    FILE *outfile;

    counters = (float *)calloc(fineDirections,sizeof(float));
    if (counters == NULL) {
        printf("Fine: cannot allocate memory\n");
        exit(-1);
    }

    step = ABS(angleEnd - angleStart)/fineDirections;

    maxLength = sqrt((double)(pict->width*pict->width+
                              pict->height*pict->height));
    for (i = 0; i < fineSamples; ++i) {
        RandomEdgePixel(pict,&x,&y);
        angle = angleStart;
        for (j = 0; j < fineDirections; ++j) {
            angle = fmod(angle,2*M_PI);
            x2 = x + maxLength*cos(angle);
            y2 = y + maxLength*sin(angle);
            counters[j] += CountLine(pict,x,y,(int)x2,(int)y2);
            angle += step;
        }
    }

    angle = angleStart;
    maxAngle = angle;
    maxValue = counters[0];
    for (i = 0; i < fineDirections; ++i) {
        angle = fmod(angle,2*M_PI);
        if (counters[i] > maxValue) {
            maxAngle = angle;
            maxValue = counters[i];
        }
        angle += step;
    }
    printf("Orientation is at %f(%f)\n",maxAngle,maxAngle/2/M_PI*360);

    /* Plot the orientation graph if requested */
    printf("Opening fine orientation plot file\n");
    if ((outfile = fopen(plotFile,"w")) == NULL) {
        printf("Error opening fine orientation plot file.\n");
        exit(-1);
    }
    angle = angleStart;
    for (i = 0; i < fineDirections; ++i) {
        angle = fmod(angle,2*M_PI);
        fprintf(outfile,"%f %f\n",angle,counters[i]);
        angle += step;
    }
    fprintf(outfile,"\"Fine Distances\n\n");
    fclose(outfile);
    printf("Done writing fine orientation plot file.\n");
    return maxAngle;
}
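Fine scans a range of angles and returns the one with the highest accumulated score, and the test program calls it repeatedly over progressively narrower ranges. The sketch below illustrates that coarse-to-fine search with the image replaced by a generic scoring callback. It is an illustration under stated assumptions: AngleScore, best_angle, refine_angle and demo_score are hypothetical names, and a single window half-width is used for every pass, whereas the test program uses different widths per pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scoring callback: larger values mean a better angle. */
typedef double (*AngleScore)(double angle);

/* Samples `directions` equally spaced angles in [start, end) and
 * returns the one with the highest score. */
double best_angle(AngleScore score, double start, double end,
                  int directions)
{
    double step, angle, bestAngle, bestValue;
    int i;

    assert(score != NULL && directions > 0 && end > start);
    step = (end - start) / directions;
    angle = start;
    bestAngle = start;
    bestValue = score(start);
    for (i = 0; i < directions; ++i) {
        double value = score(angle);
        if (value > bestValue) {
            bestValue = value;
            bestAngle = angle;
        }
        angle += step;
    }
    return bestAngle;
}

/* Repeats the scan `passes` times, each time narrowing the window to
 * +/- halfWidthSteps steps of the previous pass around its best angle. */
double refine_angle(AngleScore score, double start, double end,
                    int directions, int halfWidthSteps, int passes)
{
    double best = start;
    int p;

    assert(passes > 0 && halfWidthSteps > 0);
    for (p = 0; p < passes; ++p) {
        double step = (end - start) / directions;
        best = best_angle(score, start, end, directions);
        start = best - halfWidthSteps * step;
        end = best + halfWidthSteps * step;
    }
    return best;
}

/* Hypothetical score for demonstration only: peaks at 0.3 radians. */
double demo_score(double angle)
{
    double d = angle - 0.3;
    return -d * d;
}
```

Each pass costs the same number of score evaluations while the angular resolution improves by the ratio of the old window to the new one, which is why a few narrow passes are cheaper than one very dense scan.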

main(argc, argv)
int argc;
char *argv[];
{
    char *inFileName,*coarseOutFileName,*fineOutFileName,*fine2OutFileName;
    int fineDirections,fineSamples;
    float coarseAngle,fineAngle,fineAngle2;
    float firstSpacing,secondSpacing,thirdSpacing;
    Picture pict;

    if (argc != 7)
    {
        printf("\nUsage: %s infile coarsePlotFile finePlotFile\n",argv[0]);
        printf(" finerPlotFile #directions #samples\n\n");

        exit(0);
    }

    inFileName = argv[1];           /* get args */
    coarseOutFileName = argv[2];
    fineOutFileName = argv[3];
    fine2OutFileName = argv[4];
    fineDirections = atoi(argv[5]);
    fineSamples = atoi(argv[6]);

    pict = load_pict(inFileName);
    coarseAngle = Fine(pict,fineSamples,fineDirections,
                       0,M_PI,coarseOutFileName);
    firstSpacing = (M_PI-0)/fineDirections;
    printf("Coarse angle: %f(%f)\n",coarseAngle,coarseAngle/M_PI*180);
    printf("Coarse spacing: %f(%f)\n",firstSpacing,firstSpacing/M_PI*180);

    fineAngle = Fine(pict,fineSamples,fineDirections,
                     coarseAngle-4*firstSpacing,coarseAngle+4*firstSpacing,
                     fineOutFileName);
    secondSpacing = 8*firstSpacing/fineDirections;
    printf("Fine angle: %f(%f)\n",fineAngle,fineAngle/M_PI*180);
    printf("Fine spacing: %f(%f)\n",secondSpacing,secondSpacing/M_PI*180);

    fineAngle2 = Fine(pict,fineSamples,fineDirections,
                      fineAngle-15*secondSpacing,fineAngle+15*secondSpacing,
                      fine2OutFileName);
    thirdSpacing = 30*secondSpacing/fineDirections;
    printf("Finer angle: %f(%f)\n",fineAngle2,fineAngle2/M_PI*180);
    printf("Finer spacing: %f(%f)\n",thirdSpacing,thirdSpacing/M_PI*180);
}

Aug 15 06:32 1991 types.c 

#include "stdio.h"
#include "mylib.h"
#include "types.h"
#include "error.h"

Box MakeBox(int x,int y,int width,int height,double angle)
{
    Box temp;
    temp = (Box)calloc(1,sizeof(BoxBody));
    if (temp == NULL)
        DoError("MakeBox: out of memory\n",NULL);
    temp->x = x;
    temp->y = y;
    temp->width = width;
    temp->height = height;
    temp->angle = angle;
    return temp;
}

Point MakePoint(int x,int y)
{
    Point temp;
    temp = (Point)calloc(1,sizeof(PointBody));
    if (temp == NULL)
        DoError("MakePoint: out of memory\n",NULL);
    temp->x = x;
    temp->y = y;
    return temp;
}

We claim:

1. A method for electronically processing an electronic
document image without first decoding the electronic document
image, comprising:

segmenting the document image into word image units
without decoding the document image;

deriving a word shape representation for each of a plurality
of said word image units without decoding any
characters making up the plurality of word image units,
thereby deriving a plurality of said word shape representations;

comparing said word shape representations to at least one
other word shape representation to identify significant
word image units from amongst said plurality of word
image units; and

creating an abbreviated document image that is smaller
than the electronic document image based on said
identified significant word image units, said abbreviated
document image including a plurality of said
identified significant word image units.

2. The method of claim 1 wherein said step of comparing
includes classifying said word image units according to
frequency of occurrence based on comparing said word
shape representations with each other.

3. The method of claim 1 wherein said step of comparing
includes classifying said word image units according to
location within the document image.

4. The method of claim 1 wherein said step of deriving a
word shape representation includes utilization of at least one
of an image unit shape dimension, font, typeface, number of
ascender elements, number of descender elements, pixel
density, pixel cross-sectional characteristic, the location of
word image units with respect to neighboring word image
units, vertical position, horizontal interimage unit spacing,
and contour characteristic of said word image units.

5. The method of claim 1, wherein said comparing step
includes comparing said word shape representations with
each other.

6. The method of claim 1, wherein said comparing step
includes comparing said word shape representations with at
least one predetermined word shape representation.

7. The method of claim 1, wherein said comparing step 
includes comparing said word shape representations with at 
least one user-selected word shape representation. 

8. A method of excerpting significant information from an
undecoded document image without decoding the document
image, comprising:

segmenting the document image into word image units
without decoding the document image;

deriving a word shape representation for each of a plurality
of said word image units without decoding any
characters making up said plurality of word image
units, thereby deriving a plurality of said word shape
representations;

comparing said word shape representations to at least one
other word shape representation to identify significant
word image units from amongst said word image units;
and

outputting a plurality of said identified significant word
image units for further processing.

9. The method of claim 8 wherein said step of outputting
a plurality of identified significant image units comprises
generating a document index based on said significant
identified word image units.

10. The method of claim 8 wherein said step of outputting 
a plurality of identified significant image units comprises 
producing a speech synthesized output corresponding to said 
identified significant word image units. 

11. The method of claim 8 wherein said step of outputting 
a plurality of identified significant word image units com- 
prises producing said identified significant word image units 
in printed Braille format. 

12. The method of claim 8 wherein said step of outputting 
said a plurality of identified significant word image units 
comprises generating a document summary from said iden- 
tified significant word image units. 

13. A method for electronically processing an undecoded 
document image containing word text, comprising: 

segmenting the document image into word image units 
without decoding the document image; 

deriving a word shape representation for each of a plu- 
rality of said word image units without decoding any 
characters making up said plurality of word image 
units, thereby deriving a plurality of said word shape 
representations; 

comparing said word shape representations to at least one 
other word shape representation to identify significant 
word image units from amongst said plurality of word 
image units; 

forming phrase image units based on a plurality of said 
identified significant word image units, said phrase 
image units each incorporating one of said identified 
significant word image units and adjacent word image 
units linked in reading order sequence; and 

outputting said phrase image units. 

14. An apparatus for automatically summarizing the infor- 
mation content of an undecoded document image without 
decoding the document image, comprising: 

means for segmenting the document image into word 
image units without decoding the document image; 

means for deriving a word shape representation for each 
of a plurality of said word image units without decod- 
ing any characters making up said plurality of word 
image units, thereby deriving a plurality of said word 
shape representations; 

means for comparing said word shape representations to
at least one other word shape representation to identify
significant word image units from amongst said plurality
of word image units; and

means for creating a supplemental document image based 
on said identified significant word image units. 

15. The apparatus of claim 14 wherein said means for 
segmenting the document image, said means for deriving a 
word shape representation, said means for comparing, said 
means for creating a supplemental document image com- 
prise a programmed digital computer. 

16. The apparatus of claim 15 further comprising scanning
means for scanning an original document to produce
said document image, said scanning means being incorporated
in a document copier machine which produces printed
document copies; and means for controlling said document
copier machine to produce a printed document copy of said
supplemental document image.

17. The apparatus of claim 15 further comprising scan- 
ning means for scanning an original document to produce 
said document image, said scanning means being incorpo- 
rated in a reading machine for the blind having means for 
communicating data to the user, and means for controlling 
said reading machine communication means to communi- 
cate the contents of said supplemental document image. 

18. The apparatus of claim 17 wherein said communicating
means comprises a printer for producing document
copies in Braille format.

19. The apparatus of claim 17 wherein said communicating
means comprises a speech synthesizer for producing
synthesized speech output corresponding to said supplemental
document image.

20. The apparatus of claim 17 wherein said reading
machine includes operator responsive means for accessing
the scanned document or a selected portion thereof corresponding
to a supplemental document image following
communication of the supplemental document image to the
user.

* * * * * 






